These "last_name suffix, first_name" names generate an empty array.
Gump Jr., Bubba
Gump Jr., Bubba B.
Gump II, Bubba
Gump Jr., B.B.
However, when split-reverse-joined, the names parse correctly.
Bubba Gump Jr.
Bubba B. Gump Jr.
Bubba Gump II
B.B. Gump Jr.
This is not foolproof, because some names are formatted "first_name last_name, suffix". For those, my conversion code below would actually make the suffix the first name.
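As a rough illustration of the split-reverse-join idea (not the original conversion code), a minimal Python sketch might look like the following. The function name `to_display_order` is hypothetical, and the sketch assumes the name contains at most one meaningful comma; it does not call any name-parsing library.

```python
def to_display_order(name: str) -> str:
    """Turn 'last_name suffix, first_name' into 'first_name last_name suffix'.

    Hypothetical helper for illustration only: it splits on the first comma,
    reverses the two parts, and joins them with a space.
    """
    parts = name.split(",", 1)  # at most two parts: before and after the comma
    return " ".join(part.strip() for part in reversed(parts))


# Names from the report that work with this approach.
for raw in ["Gump Jr., Bubba", "Gump Jr., Bubba B.", "Gump II, Bubba", "Gump Jr., B.B."]:
    print(to_display_order(raw))

# The failure mode: a "first_name last_name, suffix" input puts the suffix first.
print(to_display_order("Bubba Gump, Jr."))
```

The last call shows why this is only a workaround: the function cannot tell whether the part after the comma is a given name or a suffix.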
We could probably add those patterns to the display order grammar -- because there is no comma, it will of course only work for suffixes that the tokenizer recognizes (which I think it would in this case).
Do you think you can do that?
Just splitting on commas is not a general solution, because we need to support multiple last names, which are quite common in languages like Spanish or Portuguese.
My code used for correcting the above names