Unable to find org.jcodings.specific.BaseUTF8Encoding.mbcCaseFold #25

Closed
ahorek opened this issue Oct 1, 2018 · 8 comments

Comments

@ahorek
Contributor

ahorek commented Oct 1, 2018

Hi @lopex,
the recent mail build started to fail:
https://travis-ci.org/mikel/mail/jobs/435704866
https://github.com/mikel/mail

I'm not sure whether the problem is in joni or jcodings. If you have time, please take a look, thanks.

Failure/Error: Unable to find org.jcodings.specific.BaseUTF8Encoding.mbcCaseFold(BaseUTF8Encoding.java to read failed line
     
     Java::JavaLang::ArrayIndexOutOfBoundsException:
       -2
     # org.jcodings.specific.BaseUTF8Encoding.mbcCaseFold(BaseUTF8Encoding.java:152)
     # org.jcodings.specific.UTF8Encoding.mbcCaseFold(UTF8Encoding.java:22)
     # org.joni.Search.lowerCaseMatch(Search.java:42)
     # org.joni.Search.access$000(Search.java:27)
     # org.joni.Search$11.search(Search.java:439)
     # org.joni.Matcher.forwardSearchRange(Matcher.java:137)
     # org.joni.Matcher.searchCommon(Matcher.java:425)
     # org.joni.Matcher.search(Matcher.java:301)
     # org.jruby.RubyRegexp.matcherSearch(RubyRegexp.java:231)
     # org.jruby.RubyRegexp.search(RubyRegexp.java:1306)
     # org.jruby.RubyRegexp.matchPos(RubyRegexp.java:1195)
     # org.jruby.RubyRegexp.op_match(RubyRegexp.java:1113)
     # org.jruby.RubyString.op_match(RubyString.java:1656)
     # org.jruby.RubyString$INVOKER$i$1$0$op_match.call(RubyString$INVOKER$i$1$0$op_match.gen)
     # org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
     # home.travis.build.mikel.mail.lib.mail.encodings.invokeOther6:=~(/home/travis/build/mikel/mail/lib/mail/encodings.rb:125)
     # home.travis.build.mikel.mail.lib.mail.encodings.RUBY$method$value_decode$0(/home/travis/build/mikel/mail/lib/mail/encodings.rb:125)
...
@lopex
Contributor

lopex commented Oct 2, 2018

reduced case:

"\u{1F48C}" =~ /\=\?/i

@lopex
Contributor

lopex commented Oct 2, 2018

This is related to jruby/joni#17. Onigmo appears to compare the first two bytes of "\u{1F48C}" to "=?" from the regexp's exact info field (used by the fast skip algorithms). For that it uses the mbclen(enc, p, end) function, aka onigenc_mbclen_approximate, which never returns negative values and acts as a safeguard against broken characters.
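
To illustrate the difference between a precise and an approximate length routine, here is a minimal Ruby sketch. The method names and the negative-return convention (-1 for invalid, -1 - n for n missing bytes) are assumptions made for this example; this is not the jcodings or Onigmo code, and it only handles UTF-8.

```ruby
# Hypothetical helpers for illustration only; not the jcodings/Onigmo API.

# Precise length of the UTF-8 character starting at index pos.
# limit is the exclusive end index of the readable bytes.
# Returns the byte length, -1 for an invalid sequence, or -1 - n when
# n bytes are missing before limit (assumed convention).
def precise_utf8_length(bytes, pos, limit)
  lead = bytes[pos]
  expected =
    case lead
    when 0x00..0x7F then 1
    when 0xC2..0xDF then 2
    when 0xE0..0xEF then 3
    when 0xF0..0xF4 then 4
    else 0
    end
  return -1 if expected.zero? # invalid lead byte

  available = limit - pos
  return -1 - (expected - available) if available < expected # truncated

  # Every trailing byte must look like 10xxxxxx.
  (1...expected).each do |i|
    return -1 unless (bytes[pos + i] & 0xC0) == 0x80
  end
  expected
end

# Approximate length: never negative, always at least 1.
# Truncated characters report their full expected length,
# invalid ones are treated as a single byte.
def approximate_utf8_length(bytes, pos, limit)
  len = precise_utf8_length(bytes, pos, limit)
  return len if len > 0

  missing = -1 - len
  return (limit - pos) + missing if missing > 0

  1
end

letter = "\u{1F48C}".bytes            # [0xF0, 0x9F, 0x92, 0x8C]
precise_utf8_length(letter, 0, 4)     # 4
precise_utf8_length(letter, 0, 2)     # -3 (two bytes missing)
approximate_utf8_length(letter, 0, 2) # 4
```

A caller that uses the precise result as an array index or loop bound has to handle the negative cases itself; the approximate variant can be used directly, which is the safeguard described above.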

@lopex
Contributor

lopex commented Oct 2, 2018

The issue was introduced with jruby/joni@012bb20, which turned on Search.BM_IC, the fast skip Boyer-Moore / Sunday case-insensitive search routine. The problem doesn't seem to be in the routine itself, but in how case-insensitive comparison is handled. Until we find a solution we can fall back to Search.SLOW_IC.
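
For readers unfamiliar with the skip technique, here is a rough Ruby sketch of a Sunday (quick search) case-insensitive search over bytes. It is not joni's Search.BM_IC: the method names are hypothetical, and it folds case byte by byte for ASCII only, whereas joni folds through the encoding's mbcCaseFold (the call that fails in the stack trace above).

```ruby
# Hypothetical sketch; ASCII-only folding is an assumption of this example.

# Lower-case a single ASCII byte; every other byte is returned unchanged.
def ascii_fold(byte)
  byte >= 0x41 && byte <= 0x5A ? byte + 0x20 : byte
end

# Sunday search with ASCII case folding.
# Returns the byte index of the first match, or nil if there is none.
def sunday_search_ic(text, pattern)
  hay = text.bytes
  needle = pattern.bytes.map { |b| ascii_fold(b) }
  m = needle.length
  return 0 if m.zero?

  # Shift table, indexed by the folded byte just past the current window.
  # Bytes absent from the pattern skip the whole window plus one.
  shift = Array.new(256, m + 1)
  needle.each_with_index { |b, i| shift[b] = m - i }

  pos = 0
  while pos + m <= hay.length
    return pos if (0...m).all? { |i| ascii_fold(hay[pos + i]) == needle[i] }

    nxt = pos + m
    break if nxt >= hay.length

    pos += shift[ascii_fold(hay[nxt])]
  end
  nil
end

sunday_search_ic("Hello =?utf-8?", "=?") # 6
sunday_search_ic("xxABCxx", "abc")       # 2
sunday_search_ic("\u{1F48C}", "=?")      # nil
```

Because this sketch never asks for a character length, it cannot hit the failure reported here; a routine that folds whole multibyte characters has to know where each character ends, and that is where a precise length on a partial character becomes a problem.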

@lopex
Contributor

lopex commented Oct 3, 2018

A temporary fix is in jruby/joni@118dbde; it will not degrade performance compared to previous versions. Keeping the issue open until we decide on adding unsafe and approximate length routines to org.jcodings.Encoding.

lopex added a commit to jruby/jruby that referenced this issue Oct 3, 2018
@lopex
Contributor

lopex commented Oct 3, 2018

@ahorek joni is released and the jruby snapshots are updated, thanks for the report.

@headius
Member

headius commented Apr 25, 2019

@lopex Is there a further fix needed here?

@lopex
Contributor

lopex commented Apr 25, 2019

The ultimate fix would be to implement approximate length for our encodings. For now, as a workaround, Sunday search is turned off for case-insensitive forward searches.

@lopex
Contributor

lopex commented Apr 25, 2019

Closing; I created a new issue that explains it: #26
