I get your point, but this problem is not inherent to text nodes. It could affect everything which contains characters that are not specified in ASCII (attributes, CDATA, comments, ...)
Also, I looked at the implementation of high5. The concatenation of text events you mentioned is done by joining multiple string chunks into one string "buffer" (this._buffer).
Unfortunately it will not solve this problem, because
newBuffer([0xE2,0x82]).toString()+newBuffer([0xAC]).toString()!=='β¬'//results in 'οΏ½οΏ½οΏ½' instead
When working with Buffers in a streaming fashion, you have to use StringDecoder to get UTF-8 right.
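To illustrate the difference, here is a minimal sketch using Node's built-in `string_decoder` module. The variable names (`chunks`, `naive`, `decoded`) are only for illustration and are not taken from the parser's code. It splits the three UTF-8 bytes of the euro sign across two chunks, as in the snippet above.

```javascript
const { StringDecoder } = require("string_decoder");

// The euro sign U+20AC is encoded as three bytes in UTF-8: E2 82 AC.
// Here the chunk boundary falls in the middle of that sequence.
const chunks = [Buffer.from([0xe2, 0x82]), Buffer.from([0xac])];

// Naive approach: decode each chunk on its own, then concatenate the strings.
// Each incomplete sequence is turned into replacement characters.
const naive = chunks.map((chunk) => chunk.toString("utf8")).join("");

// StringDecoder holds back incomplete trailing bytes until the next write().
const decoder = new StringDecoder("utf8");
let decoded = "";
for (const chunk of chunks) {
  decoded += decoder.write(chunk);
}
decoded += decoder.end(); // flush anything still buffered

console.log(naive === "\u20ac"); // false
console.log(decoded === "\u20ac"); // true
```

The same applies to any place where raw Buffers are turned into strings chunk by chunk, not only to text nodes.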
I think @ajafff is very right. I ran into trouble when parsing web pages, especially since the behaviour is quite unpredictable because a cut right between the two bytes happens quite rarely.
I think it would be beneficial to add this information to the /wiki/Parser-options. Would have saved me some troubles at least.
EDIT:
Ok, I am not sure if this is how it is supposed to go, but I just went ahead and wrote the note myself :-).
I forgot about this, sorry. This needs a test case in the test dir (as a new file; have a look at api.js) and quotes have to be double quotes (that's why the tests fail). Looks good otherwise.
https://stackoverflow.com/questions/12121775/convert-streamed-buffers-to-utf8-string