Yuscii is a small library that decodes a UTF-7 input stream (as specified by
RFC 2152) to Unicode. This library does not implement an encoder because, hey,
we are in 2018...
How to use it?
yuscii follows the same design as uutf and other libraries with the same
purpose: translating something to Unicode. The caller must be able to control
memory consumption, and the computation must be non-blocking. Finally, an
error should not stop the decoding process.
Here is a small example that uses uutf to translate UTF-7 to UTF-8:

    let trans ic oc =
      let decoder = Yuscii.decoder (`Channel ic) in
      let encoder = Uutf.encoder `UTF_8 (`Channel oc) in
      let rec go () =
        match Yuscii.decode decoder with
        | `Await ->
          (* XXX(dinosaure): impossible when you use `String or `Channel as
             source. *)
          assert false
        | `Uchar _ as uchar ->
          ignore @@ Uutf.encode encoder uchar ;
          go ()
        | `End ->
          (* Flush the encoder. *)
          ignore @@ Uutf.encode encoder `End
        | `Malformed err -> failwith err in
      go ()

    let () = trans stdin stdout
About UTF-7
For historical reasons, SMTP is not necessarily an 8-bit clean protocol.
In other words, SMTP may only support 7-bit data, and an 8-bit message has a
high chance of being garbled during transmission.
UTF-7 exists for this purpose and provides a way to encode a message within
this limit. Its advantage over the quoted-printable encoding or the base64
encoding (RFC 2045) is size: UTF-8 combined with quoted-printable produces a
very size-inefficient stream.
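To make the size argument concrete, here is a rough back-of-the-envelope sketch in plain OCaml. It is not part of yuscii, and the function names are made up for illustration. It assumes a run of non-ASCII characters from the Basic Multilingual Plane and ignores the line breaks that real quoted-printable and base64 encoders insert.

```ocaml
(* Size in bytes of a run of [n] non-ASCII BMP characters under each
   encoding. [utf8_len] is the UTF-8 length of one character (2 or 3).
   These helpers are hypothetical; they only do arithmetic and ignore the
   line breaks inserted by real encoders. *)

(* Quoted-printable writes every non-ASCII byte as "=XX": 3 bytes each. *)
let quoted_printable_size ~utf8_len n = 3 * utf8_len * n

(* Base64 writes 4 characters per group of 3 input bytes, padding included. *)
let base64_size ~utf8_len n = 4 * ((utf8_len * n + 2) / 3)

(* UTF-7 writes '+', then the UTF-16 code units (16 bits per BMP character)
   as unpadded base64 (6 bits per output character), then '-'. *)
let utf7_size n = 2 + ((16 * n + 5) / 6)

let () =
  let n = 10 and utf8_len = 3 in
  Printf.printf "quoted-printable: %d, base64: %d, UTF-7: %d\n"
    (quoted_printable_size ~utf8_len n)
    (base64_size ~utf8_len n)
    (utf7_size n)
```

For ten characters that each take 3 bytes in UTF-8, this prints 90, 40 and 29. For a single isolated character the shift overhead weighs more: "é" (U+00E9) is `+AOk-` in UTF-7, 5 bytes, against `=C3=A9` in quoted-printable, 6 bytes.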
Of course, nobody uses it...
About RFC2060
We rely only on RFC 2152. IMAP has its own variant of UTF-7, and this package
does not want to handle both. In other words, if you want to decode an IMAP
UTF-7 stream (mUTF-7), you should probably use something other than this library.
About encoding
As we said, almost nobody still uses UTF-7 (0.002 % according to w3techs),
and this library is just an excuse to waste our time. So encoding is
definitely not part of our plan, and if you really want to encode something to
UTF-7, you are probably wrong.
A larger decoder
As part of the mrmime project, yuscii is used by rosetta, a higher-level
decoder for a larger set of encodings. You probably want to use it to decode
everything.
Distribution
yuscii ships a small binary that translates a UTF-7 stream to UTF-8:
yuscii.to_utf8. It is provided as an example of how to use yuscii with uutf.
Did you know?
YUSCII is a 7-bit character encoding used in Yugoslavia. That's all...