Provides an implementation of a state machine for validating UTF-8 encoded strings. Clients may request that encoding errors be reported in several ways:
a simple true / false indicator
a raised exception
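To illustrate the two reporting styles listed above, here is a minimal sketch of a byte-wise UTF-8 state machine. It is not this gem's code or API: the names `sketch_valid_utf8?`, `SketchInvalidUtf8`, and the `raise_on_error` parameter are hypothetical, and the transitions follow the well-formed byte ranges from the Unicode Standard rather than the exact state table credited below.

```ruby
# Hypothetical names; this sketch is not the gem's API.
class SketchInvalidUtf8 < StandardError; end

# Returns true for well-formed UTF-8. On an encoding error it returns false,
# or raises SketchInvalidUtf8 when raise_on_error is true.
def sketch_valid_utf8?(string, raise_on_error = false)
  report = lambda do |message|
    raise SketchInvalidUtf8, message if raise_on_error
    false
  end

  needed = 0                 # continuation bytes still expected
  lower, upper = 0x80, 0xBF  # allowed range for the next continuation byte

  string.each_byte.with_index do |byte, pos|
    if needed.zero?
      case byte
      when 0x00..0x7F             then needed = 0               # ASCII
      when 0xC2..0xDF             then needed = 1
      when 0xE0                   then needed, lower = 2, 0xA0  # rejects overlong forms
      when 0xE1..0xEC, 0xEE..0xEF then needed = 2
      when 0xED                   then needed, upper = 2, 0x9F  # rejects surrogates
      when 0xF0                   then needed, lower = 3, 0x90  # rejects overlong forms
      when 0xF1..0xF3             then needed = 3
      when 0xF4                   then needed, upper = 3, 0x8F  # rejects > U+10FFFF
      else
        return report.call(format("invalid lead byte 0x%02X at offset %d", byte, pos))
      end
    else
      unless byte.between?(lower, upper)
        return report.call(format("invalid continuation byte 0x%02X at offset %d", byte, pos))
      end
      needed -= 1
      lower, upper = 0x80, 0xBF  # only the first continuation byte is restricted
    end
  end

  needed.zero? ? true : report.call("truncated sequence at end of input")
end

sketch_valid_utf8?("caf\xC3\xA9")  # => true
sketch_valid_utf8?("\xC0\x80")     # => false (overlong encoding)
```

Passing `true` as the second argument makes the same malformed input raise `SketchInvalidUtf8` instead of returning `false`.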
This validator does not provide:
UTF-8 Encoding
UTF-8 Decoding
That functionality is left as an exercise for the reader.
The Unicode Consortium | At unicode.org/ for all the information published there.
Frank Yung-Fong Tang | For the state machine algorithm. See: unicode.org/mail-arch/unicode-ml/y2003-m02/att-0467/01-The_Algorithm_to_Valide_an_UTF-8_String
Markus Kuhn | For invalid test data. See: www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
It is expected that this validator will be used in Ruby environments prior to 1.9.x. However, nothing prohibits its use with Ruby 1.9.
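For context on the version note above: Ruby 1.9 and later ship a built-in `String#valid_encoding?`, which earlier versions lack. The sketch below shows one way a caller might detect it; the helper name `sketch_builtin_utf8_check` is hypothetical and is not part of this gem.

```ruby
# Hypothetical helper; String#valid_encoding? itself is built in from Ruby 1.9.
# Returns true/false when the built-in check exists, or nil on older Rubies,
# where a separate validator such as this one would be needed.
def sketch_builtin_utf8_check(string)
  return nil unless string.respond_to?(:valid_encoding?)
  # dup so the caller's string is not re-tagged with a different encoding
  string.dup.force_encoding("UTF-8").valid_encoding?
end
```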
Please report issues on the tracker at GitHub.
Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet.
Check out the issue tracker to make sure someone hasn’t already requested it and/or contributed it.
Fork the project.
Start a feature/bugfix branch.
Commit and push until you are happy with your contribution.
Make sure to add tests for it. This is important so it does not break in a future version unintentionally.
Please try not to modify the Rakefile or VERSION file. If you require your own version, please isolate the version update in its own commit so that cherry-pick or rebase can be used to skip it.
Request a pull.
Copyright © 2011 Guy Allard. See LICENSE.txt for further details.