Problem: fuzzy-matching can be improved
Solution: Implement a better fuzzy matching algorithm
(Girish Palya)
Replace fuzzy matching algorithm with improved fzy-based implementation
The
[current](https://www.forrestthewoods.com/blog/reverse_engineering_sublime_texts_fuzzy_match/)
fuzzy matching algorithm has several accuracy issues:
* It struggles with CamelCase
* It fails to prioritize matches at the beginning of strings, often
ranking middle matches higher.
After evaluating alternatives (see my comments
[here](https://github.com/vim/vim/issues/17531#issuecomment-3112046897)
and
[here](https://github.com/vim/vim/issues/17531#issuecomment-3121593900)),
I chose to adopt the [fzy](https://github.com/jhawthorn/fzy) algorithm,
which:
* Resolves the aforementioned issues.
* Performs better.
Implementation details
This version is based on the original fzy
[algorithm](https://github.com/jhawthorn/fzy/blob/master/src/match.c),
with one key enhancement: **multibyte character support**.
* The original implementation supports only ASCII.
* This patch replaces ascii lookup tables with function calls, making it
compatible with multibyte character sets.
* Core logic (`match_row()` and `match_positions()`) remains faithful to
the original, but now operates on codepoints rather than single-byte
characters.
Performance
Tested against a dataset of **90,000 Linux kernel filenames**. Results
(in milliseconds) show a **\~2x performance improvement** over the
current fuzzy matching algorithm.
```
Search String Current Algo FZY Algo
-------------------------------------------------
init 131.759 66.916
main 83.688 40.861
sig 98.348 39.699
index 109.222 30.738
ab 72.222 44.357
cd 83.036 54.739
a 58.94 62.242
b 43.612 43.442
c 64.39 67.442
k 40.585 36.371
z 34.708 22.781
w 38.033 30.109
cpa 82.596 38.116
arz 84.251 23.964
zzzz 35.823 22.75
dimag 110.686 29.646
xa 43.188 29.199
nha 73.953 31.001
nedax 94.775 29.568
dbue 79.846 25.902
fp 46.826 31.641
tr 90.951 55.883
kw 38.875 23.194
rp 101.575 55.775
kkkkkkkkkkkkkkkkkkkkkkkkkkkkk 48.519 30.921
```
```vim
vim9script
var haystack = readfile('/Users/gp/linux.files')
var needles = ['init', 'main', 'sig', 'index', 'ab', 'cd', 'a', 'b',
'c', 'k',
'z', 'w', 'cpa', 'arz', 'zzzz', 'dimag', 'xa', 'nha', 'nedax',
'dbue',
'fp', 'tr', 'kw', 'rp', 'kkkkkkkkkkkkkkkkkkkkkkkkkkkkk']
for needle in needles
var start = reltime()
var tmp = matchfuzzy(haystack, needle)
echom $'{needle}' (start->reltime()->reltimefloat() * 1000)
endfor
```
Additional changes
* Removed the "camelcase" option from both matchfuzzy() and
matchfuzzypos(), as it's now obsolete with the improved algorithm.
related: neovim/neovim#34101
fixes #17531
closes: #17900
Signed-off-by: Girish Palya <girishji@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
If you find a bug or want to discuss the best way to add a new feature, please open an issue. If you have a question or want to discuss the best way to do something with Vim, you can use StackExchange or one of the Maillists.
What is Vim?
Vim is a greatly improved version of the good old UNIX editor
Vi. Many new
features have been added: multi-level undo, syntax highlighting, command line
history, on-line help, spell checking, filename completion, block operations,
script language, etc. There is also a Graphical User Interface (GUI)
available. Still, Vi compatibility is maintained, those who have Vi "in the
fingers" will feel at home.
See runtime/doc/vi_diff.txt for differences with
Vi.
This editor is very useful for editing programs and other plain text files. All commands are given with normal keyboard characters, so those who can type with ten fingers can work very fast. Additionally, function keys can be mapped to commands by the user, and the mouse can be used.
Vim also aims to provide a (mostly) POSIX-compatible vi implementation, when compiled with a minimal feature set (typically called vim.tiny), which is used by many Linux distributions as the default vi editor.
Vim runs under MS-Windows (7, 8, 10, 11), macOS, Haiku, VMS and almost all flavours of UNIX. Porting to other systems should not be very difficult. Older versions of Vim run on MS-DOS, MS-Windows 95/98/Me/NT/2000/XP/Vista, Amiga DOS, Atari MiNT, BeOS, RISC OS and OS/2. These are no longer maintained.
For Vim9 script see README_VIM9.
Distribution
You can often use your favorite package manager to install Vim. On Mac and Linux a small version of Vim is pre-installed, you still need to install Vim if you want more features.
There are separate distributions for Unix, PC, Amiga and some other systems.
This README.md file comes with the runtime archive. It includes the
documentation, syntax files and other files that are used at runtime. To run
Vim you must get either one of the binary archives or a source archive.
Which one you need depends on the system you want to run it on and whether you
want or must compile it yourself. Check https://www.vim.org/download.php for
an overview of currently available distributions.
Some popular places to get the latest Vim:
- Check out the git repository from GitHub.
- Get the source code as an archive.
- Get a Windows executable from the vim-win32-installer repository.
Compiling
If you obtained a binary distribution you don't need to compile Vim. If you
obtained a source distribution, all the stuff for compiling Vim is in the
src directory. See src/INSTALL for instructions.
Installation
See one of these files for system-specific instructions. Either in the READMEdir directory (in the repository) or the top directory (if you unpack an archive):
README_ami.txt Amiga
README_unix.txt Unix
README_dos.txt MS-DOS and MS-Windows
README_mac.txt Macintosh
README_haiku.txt Haiku
README_vms.txt VMS
There are other README_*.txt files, depending on the distribution you used.
Documentation
The Vim tutor is a one hour training course for beginners. Often it can be
started as vimtutor. See :help tutor for more information.
The best is to use :help in Vim. If you don't have an executable yet, read
runtime/doc/help.txt.
It contains pointers to the other documentation files.
The User Manual reads like a book and is recommended to learn to use
Vim. See :help user-manual.
Copying
Vim is Charityware. You can use and copy it as much as you like, but you are
encouraged to make a donation to help orphans in Uganda. Please read the file
runtime/doc/uganda.txt
for details (do :help uganda inside Vim).
Summary of the license: There are no restrictions on using or distributing an unmodified copy of Vim. Parts of Vim may also be distributed, but the license text must always be included. For modified versions, a few restrictions apply. The license is GPL compatible, you may compile Vim with GPL libraries and distribute it.
Sponsoring
Fixing bugs and adding new features takes a lot of time and effort. To show your appreciation for the work and motivate developers to continue working on Vim please send a donation.
The money you donated will be mainly used to help children in Uganda. See
runtime/doc/uganda.txt. But at the same time
donations increase the development team motivation to keep working on Vim!
For the most recent information about sponsoring look on the Vim web site: https://www.vim.org/sponsor/
Contributing
If you would like to help make Vim better, see the CONTRIBUTING.md file.
Information
If you are on macOS, you can use MacVim.
The latest news about Vim can be found on the Vim home page: https://www.vim.org/
If you have problems, have a look at the Vim documentation or tips: https://www.vim.org/docs.php https://vim.fandom.com/wiki/Vim_Tips_Wiki
If you still have problems or any other questions, use one of the mailing lists to discuss them with Vim users and developers: https://www.vim.org/maillist.php
If nothing else works, report bugs directly to the vim-dev mailing list:
<vim-dev@vim.org>
Main author
Most of Vim was created by Bram Moolenaar <Bram@vim.org>
Bram-Moolenaar
Send any other comments, patches, flowers and suggestions to the vim-dev mailing list:
<vim-dev@vim.org>
This is README.md for version 9.1 of Vim: Vi IMproved.
