Arc Forumnew | comments | leaders | submitlogin
9 points by olavk 6176 days ago | link | parent

Or just one type of string: unicode character strings (which is sequences of unicode code points). Then a seperate type for byte arrays. Byte arrays are not character strings, but can easily be translated into a string (and back).


3 points by olavk 6176 days ago | link

...and this seem to be exactly what MzScheme provides :-) Strings in MzScheme are sequences of unicode code points. "bytes" is a seperate type which is a sequence of bytes. There are functions to translate between the two, given an encoding.

Python 3000 is close to this, but I think Python muddles the issue by providing character-releated operations like "capitalize" and so on on byte arrays. This is bound to lead to confusion. (The reason seem to be that the byte array is really the old 8-bit string type renamed. Will it never go away?) MzScheme does not have that issue.

-----