CARVIEW |
Navigation Menu
-
Notifications
You must be signed in to change notification settings - Fork 232
Emulate VSFilter’s bidi and \fay by default; add an ASS_Feature for better bidi and shaping #441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4245d07
to
7da31cf
Compare
Also, I think it makes more sense to run bidi on the whole event, rather than on each explicitly-delimited line (as we do currently), no? The question is, given:
is it more likely to mean
Because our current code (with Encoding=-1, otherwise there’s nothing to discuss) does 2. |
In fact, Unicode seems to mandate the latter behaviour, saying the text must first be split on paragraph separators, among which it lists the line feed, and then each paragraph’s base direction must be separately autodetected. This is not what FriBidi does though: it does split paragraphs on line feeds, but it runs direction detection only once for the whole input buffer. And I still think that makes more sense for our use case. |
52883d2
to
f3b3219
Compare
Pushed several additional fixes for bidi and shaping (and fixed the Travis CI build failure). And renamed the feature to This improves both our compatibility with VSFilter when the new feature is off (I think we should be identical now, barring shaping differences between Uniscribe and HarfBuzz) and our rendering when the new feature is on. |
Maybe call it |
Two reasons I didn’t:
|
f3b3219
to
267b493
Compare
Since the last push, this PR additionally affects baseline reset with
Perhaps with As for the default setting: How about we make it default to whole-line bidi for Unfortunately, being a left-to-right user myself, I don’t know (and haven’t investigated yet) whether this compromise would help in practice. But it sounds like it might. |
I think I agree, this look bad #148 also here https://web.archive.org/web/20160120113701/https://code.google.com/p/libass/issues/detail?id=114 it can be happen even in English
seems fine, I really hope to fix those soon so I can let vsfilter RIP and only use libass |
using assrender https://github.com/pinterf/assrender , both avs+ and assrender should work in Linux nowadays (I did use windows) and since its easy to build and test using avspmod now (at least for me) I did some tests using the sample from #71
BUT #374 now look like this |
Thanks! I can confirm that the sample in #374 is broken and I’ve tracked down the bug. The specific sample in #374, with the explicit line break, will now be fixed. But this has highlighted a problem. We run bidi, then wrap lines. But each line is its own run and needs its own bidi. I don’t know how to fix this reliably. |
For example: take the sample from #374 and replace the line break by a space. Use a font large enough that the automatic line wrapping breaks the line in the same place as before. VSFilter will show the same result that is seen in #374, but libass—both with and without this PR—will show the output in the screenshot above. |
4f34787
to
0279952
Compare
I’ve pushed a new version:
What remains is to somehow re-bidi runs that line wrapping splits into multiple parts. An obvious, dumb approach is to simply rerun bidi and shaping on such runs after line wrapping. But it may be possible that this changes their widths and invalidates the line wrapping itself. |
it's seems fine now, I will do more test later |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first look at everything but the \fay-changes; will lokk at them later, possibly tomorrow.
For the parts I did look at, there are only some style nits. In general:
- I agree on running bidi per paragraph.
- I'm not completly happy with the name
WHOLE_TEXT
, as it's a bit too unspecific and doesn't quite fit, as eg bidi happens per paragraph. But I'm also lackin
g any better ideas, so … - I'm not sure why you wnat to re-bidi after line wrapping
As I understand it (after skimming through) the Unicode/WebVTT standards require BiDi per paragraph not per line. So only forced line breaks should matte
here.
If it's about compatability, then the obvious answer to how to do this is to just do the same as VSF sans UB, non-terminating loops etc.
Can you rebase this, throwing out the commits from #445, so we can run regressions test on it?
Thanks for the (initial) review! I’ve rebased on master and thrown out the duplicate commits. In retrospect, I should’ve done this before deleting the But why do we want leading operators again? We have 32 instances of leading operators, in 21 distinct consecutive-line blocks, as per this regex: ^\s*([|^%>]|/[^/*]|\+[^+]|-[^-]|&[& ]|< ) including 11 operators in 3 line blocks in We have 165 instances of trailing operators, in 88 distinct consecutive-line blocks, as per this analogous regex: ([|^%<]|[^/*]/|[^+]\+|[^-]-|[& ]&| >)\s*$ including 11 operators in 5 line blocks in |
ead32a1
to
d688344
Compare
Were there any blockers/todos left for this? (Apart from it merge-conflicting with the also pending #545) |
Todos yes: rerun bidi after line wrapping somehow. But I don’t think it should be a blocker, as the current PR still provides a strict improvement in VSFilter compatibility and hopefully covers most of the real-world cases. Plus bikesheds: about the feature name and about the feature mechanism itself. |
iirc none of us could come up with a better name than |
97730c4
to
c4810fd
Compare
Quick summary of our naming bikeshed brainstorm a while ago on IRC:
|
In absence of any native speaker feedback on “semaless”, does |
hi, sorry for my late (I've been busy for over a month) I have no problem with SEAMLESS_TEXT_LAYOUT, if it about "funny" then I think it already is in ".ass" |
c4810fd
to
56f8987
Compare
Rebased on top of current |
The range-diff of the rebase seems perfectly fine to me and ART also finishes as expected. |
Fixes https://code.google.com/archive/p/libass/issues/111. Fixes libass#226 if it was not fixed by commit c93cb3d already. The old ("sane") behavior of running bidi on each event as a whole* can be restored by setting the new ASS_FEATURE_WHOLE_TEXT_LAYOUT. Additionally, in a nod to ASS files and on-the-fly format converters that use libass's nonstandard Encoding -1 for bidi base direction autodetection, make Encoding -1 force WHOLE_TEXT_LAYOUT processing for individual events even when the ASS_Feature is disabled. * Note: we treat explicit line breaks \N as paragraph separators, and effectively bidi is run on each "paragraph" separately. This is standard bidi behavior. Of particular relevance to libass, this is exactly what WebVTT mandates.
When WHOLE_TEXT_LAYOUT is disabled, keep calling HarfBuzz with no context at all. It might seem sensible to give it the whole VSFilter-run as context so it could shape Arabic correctly across font fallback in case of weird fonts, but it seems that Uniscribe doesn't behave like this in VSFilter. For testing that, I created a font with only two glyphs: an isolated ain and a medial ain; and a GSUB medi lookup that links them together. Setting \fnMyTestFont on an Arabic word with a medial ain, libass with WHOLE_TEXT_LAYOUT (i. e. HarfBuzz with context) showed the medial shape, but VSFilter (i. e. Uniscribe) showed the isolated shape. BorderStyle 3 confirmed that VSFilter was treating the string as a single run and therefore passing it to GDI/Uniscribe as a whole.
This can be reverted by enabling ASS_FEATURE_WHOLE_TEXT_LAYOUT.
These calls don't perform any bidi or compute anything clever. They just fetch some Unicode properties for each code point.
Without WHOLE_TEXT_LAYOUT, follow VSFilter and reset the baseline for each run. With WHOLE_TEXT_LAYOUT, try to shear each line as a whole. There are still issues though: * \fscx0 glyphs are skipped and do not contribute any shear; * applying shear to each glyph and then scaling it together with the shear makes little sense for whole lines to begin with.
There used to be a FIXME comment about this, but it was removed in commit 27cc033. Reordering works fine without this *most of the time*, but this is still a bug. It affects one thing only: trailing whitespace in each line/run. Of course, we remove trailing U+0020 SPACE characters from each line regardless of bidi, but Unicode bidi recognizes other whitespace, and we can now also have trailing U+0020 within in-line runs. With right-to-left base direction, all trailing whitespace within each line (after line wrapping) or run are to be placed at the leftmost end in visual order (even if that whitespace is nominally part of a left-to-right embed). Conversely, with our current code that always tells FriBidi the base direction is left-to-right here, trailing whitespace is always placed at the rightmost end of each line, even if it is inside right-to-left text (which makes it appear as *leading* whitespace to the reader). We treat explicit line breaks \N as paragraph separators, where each paragraph can have its own base direction. Therefore, we must remember each paragraph's resolved direction separately.
56f8987
to
cc54eea
Compare
@astiob: I didn't notice this before, sorry, but shouldn't this have bumped |
Indeed, thank you. Pushed. |
Guess what.