patch 9.0.1617: charidx() result is not consistent with byteidx()

Problem:    charidx() and utf16idx() result is not consistent with byteidx().
Solution:   When the index is equal to the length of the text return the
            length of the text instead of -1. (Yegappan Lakshmanan,
            closes #12503)
Yegappan Lakshmanan
2023-06-08 17:09:45 +01:00
committed by Bram Moolenaar
parent 5bf042810b
commit 577922b917
6 changed files with 132 additions and 52 deletions
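
The boundary rule in the commit message can be illustrated with a small Python model. This is only a sketch, not Vim's implementation: it assumes UTF-8 text and counts every code point as one character (roughly charidx() with {countcc} set and byteidxcomp()), so composing characters are not merged. The function names merely mirror the Vim built-ins.

```python
def charidx(s: str, idx: int) -> int:
    """Model: character index of the byte at idx, or -1 if out of range."""
    data = s.encode("utf-8")
    if idx < 0 or idx > len(data):
        return -1              # fewer than idx bytes
    if idx == len(data):
        return len(s)          # exactly idx bytes: length in characters
    # Count the characters that start at or before byte idx.
    starts = 0
    for b in data[: idx + 1]:
        if b & 0xC0 != 0x80:   # not a UTF-8 continuation byte
            starts += 1
    return starts - 1


def byteidx(s: str, nr: int) -> int:
    """Model: byte index of character nr, or -1 if out of range."""
    if nr < 0 or nr > len(s):
        return -1              # fewer than nr characters
    return len(s[:nr].encode("utf-8"))


if __name__ == "__main__":
    text = "a\u00e9"  # 'a' is 1 byte, 'é' is 2 bytes: 3 bytes, 2 characters
    print(charidx(text, 3))                 # 2 (the old rule gave -1 here)
    print(charidx(text, 4))                 # -1
    print(byteidx(text, charidx(text, 3)))  # 3
```

With the end index accepted, converting the byte length to a character index and back yields the byte length again, which is the consistency with byteidx() that the patch is after. According to the help text changed below, utf16idx() follows the same rule and returns the length in UTF-16 code units.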

@@ -1528,11 +1528,13 @@ charidx({string}, {idx} [, {countcc} [, {utf16}]])
When {utf16} is present and TRUE, {idx} is used as the UTF-16
index in the String {expr} instead of as the byte index.
-Returns -1 if the arguments are invalid or if {idx} is greater
-than the index of the last byte in {string}. An error is
-given if the first argument is not a string, the second
-argument is not a number or when the third argument is present
-and is not zero or one.
+Returns -1 if the arguments are invalid or if there are less
+than {idx} bytes. If there are exactly {idx} bytes the length
+of the string in characters is returned.
+An error is given and -1 is returned if the first argument is
+not a string, the second argument is not a number or when the
+third argument is present and is not zero or one.
See |byteidx()| and |byteidxcomp()| for getting the byte index
from the character index and |utf16idx()| for getting the
@@ -10119,8 +10121,8 @@ uniq({list} [, {func} [, {dict}]]) *uniq()* *E882*
<
*utf16idx()*
utf16idx({string}, {idx} [, {countcc} [, {charidx}]])
-Same as |charidx()| but returns the UTF-16 index of the byte
-at {idx} in {string} (after converting it to UTF-16).
+Same as |charidx()| but returns the UTF-16 code unit index of
+the byte at {idx} in {string} (after converting it to UTF-16).
When {charidx} is present and TRUE, {idx} is used as the
character index in the String {string} instead of as the byte
@@ -10128,6 +10130,10 @@ utf16idx({string}, {idx} [, {countcc} [, {charidx}]])
An {idx} in the middle of a UTF-8 sequence is rounded upwards
to the end of that sequence.
+Returns -1 if the arguments are invalid or if there are less
+than {idx} bytes in {string}. If there are exactly {idx} bytes
+the length of the string in UTF-16 code units is returned.
See |byteidx()| and |byteidxcomp()| for getting the byte index
from the UTF-16 index and |charidx()| for getting the
character index from the UTF-16 index.