Add LWTA abbreviation support #12109
…ed (e.g. turn international into int. instead of int.al)
JUnit tests are failing. In the area "Some checks were not successful", locate "Tests / Unit tests (pull_request)" and click on "Details". This brings you to the test output.
You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.
We MUST NOT include this file in the source code tree. We need a gradle action downloading the file to
@@ -22,6 +22,8 @@ public class AbbreviationParser {
    // Ensures ordering while preventing duplicates
    private final LinkedHashSet<Abbreviation> abbreviations = new LinkedHashSet<>();

    private final LinkedHashSet<LwtaAbbreviation> lwtaAbbreviations = new LinkedHashSet<>();
Subclass the AbbreviationParser, because LWTA is separate functionality (too little coupling with the other methods). Someone will say that composition should be preferred over inheritance; I'm not sure which methods of this class you really need.
String name = csvRecord.size() > 0 ? csvRecord.get(0) : "";
String abbreviation = csvRecord.size() > 1 ? csvRecord.get(1) : "";

// Check name and abbreviation
Remove comment - this is clear from the statement.
if (string.endsWith("-")) {
    string = string.substring(0, string.length() - 1);
}
Just use
string = org.jabref.model.strings.StringUtil.removeStringAtTheEnd(string, "-");
(Replace the fully qualified name with a normal import; I just want to show you the package.)
private final boolean allowsPrefix;

enum Position {
    ENDS_WORD, STARTS_WORD, IN_WORD, FULL_WORD
Sort more logically: full, starts, in, end
@@ -0,0 +1,41 @@
package org.jabref.logic.journals;

public class LwtaAbbreviation {
Refactor to record
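A minimal sketch of what the record refactoring could look like. The component names (`word`, `abbreviation`, `position`, `allowsPrefix`) are assumptions based on the fragments quoted in this review, not the actual fields of the class in the PR, and `LwtaRecordSketch` is only a hypothetical wrapper so the example compiles on its own. The enum constants are listed in the order suggested above (full, starts, in, ends).

```java
import java.util.Objects;

// Hypothetical wrapper class so the sketch compiles on its own.
final class LwtaRecordSketch {

    // Constants ordered as suggested in the review: full, starts, in, ends.
    enum Position {
        FULL_WORD, STARTS_WORD, IN_WORD, ENDS_WORD
    }

    // Component names are assumptions; the real class may carry different data.
    // A record generates the constructor, accessors, equals, hashCode and toString.
    record LwtaAbbreviation(String word, String abbreviation, Position position, boolean allowsPrefix) {

        // Compact constructor: reject null components early.
        LwtaAbbreviation {
            Objects.requireNonNull(word, "word");
            Objects.requireNonNull(abbreviation, "abbreviation");
            Objects.requireNonNull(position, "position");
        }
    }
}
```

Because records compare by value, two instances with the same components are equal, which also suits the `LinkedHashSet` used in the parser to prevent duplicates.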
private final Map<String, LwtaAbbreviation> lwtaToAbbreviationObject;

// incomplete list
What is an "incomplete" list? What should one do to add words? Either state that or remove the comment.
/**
 * instantiates this class with a csv file
 */
Remove comment - or replace it by an explanation of the @param file parameter.
public class LwtaAbbreviationRepository {

    private final Map<String, LwtaAbbreviation> lwtaToAbbreviationObject;
Remove Object at the end. Nearly everything is an object in Java.
    static final String[] JOURNAL_NAMES = new String[]{"international journal", "Journal of Medicine", "journal of medicine", "journal", "Physics & geobiology"};
    static final String[] ABBREVIATED_NAMES = new String[]{"int. j.", "J. Medicine", "j. medicine", "journal", "Phys. geobiol."};

    @Test
    void abbreviateJournalNameTest() throws IOException {
        Path path1 = Paths.get("src", "main", "resources", "ltwa_abb.csv");
        LwtaAbbreviationRepository lwtaAbbreviationRepository = new LwtaAbbreviationRepository(path1);
        for (int i = 0; i < JOURNAL_NAMES.length; i++) {
            assertEquals(lwtaAbbreviationRepository.abbreviateJournalName(JOURNAL_NAMES[i]), ABBREVIATED_NAMES[i]);
        }
    }
}
Convert to @ParameterizedTest.
(This is a slight remake of another pull request which I made from main by accident)
See issue https://github.com/koppor/jabref/issues/215:
Currently I've written the code to read the .csv file containing all lwta abbreviations, and to abbreviate a journal name. I note the following problems with what I've written:
-- It doesn't work properly on some edge cases of lwta abbreviations. Occasionally the standard requires context clues, e.g. distinguishing between 'real' (English) and 'real' (Spanish), or removing '&' when it means 'and' but keeping it inside abbreviations.
-- lwta abbreviations require the elimination of articles and prepositions (in most cases), which I've only implemented here with a hand-written and currently far from exhaustive list of such words.
-- I'm a little unsure of the class structure I've written for the implementation
Furthermore, it will not quite be possible to write a fully correct unabbreviate method, as the abbreviation is not quite injective.
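To illustrate the non-injectivity point, here is a small self-contained sketch. It is not the code from this PR: `LwtaSketch`, its two-entry word map and its stop-word set are hypothetical stand-ins for the CSV data and the article/preposition list, and it ignores capitalization, prefixes and suffixes. It only shows that once stop words are dropped, distinct titles can collapse to the same abbreviation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in, not the repository class from the PR.
final class LwtaSketch {

    // Tiny assumed excerpt of word-to-abbreviation data.
    private static final Map<String, String> WORD_TO_ABBREVIATION = Map.of(
            "international", "int.",
            "journal", "j.");

    // Assumed, non-exhaustive list of articles and prepositions.
    private static final Set<String> STOP_WORDS = Set.of("of", "the", "for", "in", "and");

    static String abbreviate(String title) {
        String[] words = title.trim().split("\\s+");
        if (words.length == 1) {
            // Single-word titles are left as they are.
            return title.trim();
        }
        List<String> result = new ArrayList<>();
        for (String word : words) {
            String key = word.toLowerCase(Locale.ROOT);
            if (STOP_WORDS.contains(key)) {
                continue; // drop articles and prepositions
            }
            // Unknown words are kept unchanged.
            result.add(WORD_TO_ABBREVIATION.getOrDefault(key, word));
        }
        return String.join(" ", result);
    }
}
```

With this sketch, "journal of medicine" and "journal for medicine" both become "j. medicine", so no unabbreviate method could decide which full title was meant without extra information.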
If the above problems are deal-breakers, then I won't bother continuing to write the frontend / mvstore code (hence why this incomplete pr exists).
Mandatory checks
CHANGELOG.md
described in a way that is understandable for the average user (if change is visible to the user)