Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor JournalAbbreviationRepository class to use on-disk maps #12606

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from
17 changes: 16 additions & 1 deletion src/main/java/org/jabref/cli/JournalListMvGenerator.java
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ public static void main(String[] args) throws IOException {
compressHigh().
open()) {
MVMap<String, Abbreviation> fullToAbbreviation = store.openMap("FullToAbbreviation");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does one really need this - I think, this can be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need them to create the mv file to be reused in the JorunalAbbreviationRepository

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the whole abbreviation functions can be written without using the full object?!

MVMap<String, Abbreviation> abbreviationToAbbreviation = store.openMap("AbbreviationToAbbreviation");
MVMap<String, Abbreviation> dotlessToAbbreviation = store.openMap("DotlessToAbbreviation");
MVMap<String, Abbreviation> shortestUniqueToAbbreviation = store.openMap("ShortestUniqueToAbbreviation");

stream.forEach(Unchecked.consumer(path -> {
String fileName = path.getFileName().toString();
System.out.print("Checking ");
Expand All @@ -67,7 +71,18 @@ public static void main(String[] args) throws IOException {
}
return abbreviation2;
}));
fullToAbbreviation.putAll(abbreviationMap);

abbreviationMap.forEach((name, abbr) -> {
String abbreviationString = abbr.getAbbreviation();
String shortestUnique = abbr.getShortestUniqueAbbreviation();

Abbreviation newAbbreviation = new Abbreviation(name, abbreviationString, shortestUnique);

fullToAbbreviation.put(name, newAbbreviation);
abbreviationToAbbreviation.put(abbr.getAbbreviation(), newAbbreviation);
dotlessToAbbreviation.put(abbr.getDotlessAbbreviation(), newAbbreviation);
shortestUniqueToAbbreviation.put(abbr.getShortestUniqueAbbreviation(), newAbbreviation);
});
}
}));
}
Expand Down
4 changes: 2 additions & 2 deletions src/main/java/org/jabref/logic/journals/Abbreviation.java
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,9 @@ public class Abbreviation implements Comparable<Abbreviation>, Serializable {

private static final long serialVersionUID = 1;

private transient String name;
private final String name;
private final String abbreviation;
private transient String dotlessAbbreviation;
private transient final String dotlessAbbreviation;

// Is the empty string if not available
private String shortestUniqueAbbreviation;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,6 @@

import java.nio.file.Path;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
Expand All @@ -20,48 +18,48 @@
public class JournalAbbreviationRepository {
static final Pattern QUESTION_MARK = Pattern.compile("\\?");

private final Map<String, Abbreviation> fullToAbbreviationObject = new HashMap<>();
private final Map<String, Abbreviation> abbreviationToAbbreviationObject = new HashMap<>();
private final Map<String, Abbreviation> dotlessToAbbreviationObject = new HashMap<>();
private final Map<String, Abbreviation> shortestUniqueToAbbreviationObject = new HashMap<>();
private static final String FULL_TO_ABBREVIATION_MAP_NAME = "FullToAbbreviation";
private static final String ABBREVIATION_TO_ABBREVIATION_MAP_NAME = "AbbreviationToAbbreviation";
private static final String DOTLESS_TO_ABBREVIATION_MAP_NAME = "DotlessToAbbreviation";
private static final String SHORTEST_UNIQUE_TO_ABBREVIATION_MAP_NAME = "ShortestUniqueToAbbreviation";

private final MVStore store;
private MVMap<String, Abbreviation> fullToAbbreviationMap;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this map still needed? Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Since the name and dotlessAbreviation fields are transient, I use this to create the abbreviation from its name (which is the key), the abbreviation string, and the shortUniqueAbbrevition fields which can be contained from the value of this map.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the abbreviation created? I think, it is only required for the UI when abbreviations are changed. When using the logic module for abbreviations, there should be no need for this object?!

In call cases, when updating an abbreviation in the UI the correct behavior needs to be (manually) tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the get function, I need to create the abbreviation again for the same reason that the abbreviation contains transient fields. If you can confirm that do we need the fields of the abbreviation to be transient or not this may help a lot. From what I understand, I see we don't need them to be transient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please investigate the usage for yourself. I have no clue about the transient - i think, this property needs to be removed if they are kept be stored in the mv store, which I doubt!

private MVMap<String, Abbreviation> abbreviationToAbbreviationMap;
private MVMap<String, Abbreviation> dotlessToAbbreviationMap;
private MVMap<String, Abbreviation> shortestUniqueToAbbreviationMap;

private final TreeSet<Abbreviation> customAbbreviations = new TreeSet<>();

/**
* Initializes the internal data based on the abbreviations found in the given MV file
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readd the javadoc

public JournalAbbreviationRepository(Path journalList) {
MVMap<String, Abbreviation> mvFullToAbbreviationObject;
try (MVStore store = new MVStore.Builder().readOnly().fileName(journalList.toAbsolutePath().toString()).open()) {
mvFullToAbbreviationObject = store.openMap("FullToAbbreviation");
mvFullToAbbreviationObject.forEach((name, abbreviation) -> {
String abbrevationString = abbreviation.getAbbreviation();
String shortestUniqueAbbreviation = abbreviation.getShortestUniqueAbbreviation();
Abbreviation newAbbreviation = new Abbreviation(
name,
abbrevationString,
shortestUniqueAbbreviation
);
fullToAbbreviationObject.put(name, newAbbreviation);
abbreviationToAbbreviationObject.put(abbrevationString, newAbbreviation);
dotlessToAbbreviationObject.put(newAbbreviation.getDotlessAbbreviation(), newAbbreviation);
shortestUniqueToAbbreviationObject.put(shortestUniqueAbbreviation, newAbbreviation);
});
}
String journalPath = journalList.toAbsolutePath().toString();
store = new MVStore.Builder().fileName(journalPath).open();

openMaps(store);
}

/**
* Initializes the repository with demonstration data. Used if no abbreviation file is found.
*/
public JournalAbbreviationRepository() {
// this will persist in memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found in the H2 database docs that when we don't provide a path to the store, the whole maps of the store will persist in the memory. So I wanted to mention this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a link to the documentation.

Even in 2025, the IDEs do not inline comments in PRs in code comments. Thus, we need to do this for ourselves.

store = new MVStore.Builder().open();

openMaps(store);

Abbreviation newAbbreviation = new Abbreviation(
"Demonstration",
"Demo",
"Dem"
);
fullToAbbreviationObject.put("Demonstration", newAbbreviation);
abbreviationToAbbreviationObject.put("Demo", newAbbreviation);
dotlessToAbbreviationObject.put("Demo", newAbbreviation);
shortestUniqueToAbbreviationObject.put("Dem", newAbbreviation);

fullToAbbreviationMap.put("Demonstration", newAbbreviation);
abbreviationToAbbreviationMap.put("Demo", newAbbreviation);
dotlessToAbbreviationMap.put("Demo", newAbbreviation);
shortestUniqueToAbbreviationMap.put("Dem", newAbbreviation);
}

private static boolean isMatched(String name, Abbreviation abbreviation) {
Expand All @@ -72,8 +70,7 @@ private static boolean isMatched(String name, Abbreviation abbreviation) {
}

private static boolean isMatchedAbbreviated(String name, Abbreviation abbreviation) {
boolean isExpanded = name.equalsIgnoreCase(abbreviation.getName());
if (isExpanded) {
if (name.equalsIgnoreCase(abbreviation.getName())) {
return false;
}
return name.equalsIgnoreCase(abbreviation.getAbbreviation())
Expand All @@ -90,11 +87,12 @@ public boolean isKnownName(String journalName) {
return false;
}
String journal = journalName.trim().replaceAll(Matcher.quoteReplacement("\\&"), "&");

return customAbbreviations.stream().anyMatch(abbreviation -> isMatched(journal, abbreviation))
|| fullToAbbreviationObject.containsKey(journal)
|| abbreviationToAbbreviationObject.containsKey(journal)
|| dotlessToAbbreviationObject.containsKey(journal)
|| shortestUniqueToAbbreviationObject.containsKey(journal);
|| fullToAbbreviationMap.containsKey(journal)
|| abbreviationToAbbreviationMap.containsKey(journal)
|| dotlessToAbbreviationMap.containsKey(journal)
|| shortestUniqueToAbbreviationMap.containsKey(journal);
}

/**
Expand All @@ -106,10 +104,11 @@ public boolean isAbbreviatedName(String journalName) {
return false;
}
String journal = journalName.trim().replaceAll(Matcher.quoteReplacement("\\&"), "&");

return customAbbreviations.stream().anyMatch(abbreviation -> isMatchedAbbreviated(journal, abbreviation))
|| abbreviationToAbbreviationObject.containsKey(journal)
|| dotlessToAbbreviationObject.containsKey(journal)
|| shortestUniqueToAbbreviationObject.containsKey(journal);
|| abbreviationToAbbreviationMap.containsKey(journal)
|| dotlessToAbbreviationMap.containsKey(journal)
|| shortestUniqueToAbbreviationMap.containsKey(journal);
}

/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wha did you remove all the javadoc comments?

Expand All @@ -128,10 +127,16 @@ public Optional<Abbreviation> get(String input) {
return customAbbreviation;
}

return Optional.ofNullable(fullToAbbreviationObject.get(journal))
.or(() -> Optional.ofNullable(abbreviationToAbbreviationObject.get(journal)))
.or(() -> Optional.ofNullable(dotlessToAbbreviationObject.get(journal)))
.or(() -> Optional.ofNullable(shortestUniqueToAbbreviationObject.get(journal)));
Abbreviation abbr = Optional.ofNullable(fullToAbbreviationMap.get(journal))
.or(() -> Optional.ofNullable(abbreviationToAbbreviationMap.get(journal)))
.or(() -> Optional.ofNullable(dotlessToAbbreviationMap.get(journal)))
.or(() -> Optional.ofNullable(shortestUniqueToAbbreviationMap.get(journal)))
.orElse(null);
if (abbr != null) {
// Recreate the Abbreviation so that the dotless field is derived from the abbreviation.
abbr = new Abbreviation(abbr.getName(), abbr.getAbbreviation(), abbr.getShortestUniqueAbbreviation());
}
return Optional.ofNullable(abbr);
}

public void addCustomAbbreviation(Abbreviation abbreviation) {
Expand Down Expand Up @@ -168,10 +173,17 @@ public Optional<String> getShortestUniqueAbbreviation(String text) {
}

public Set<String> getFullNames() {
return fullToAbbreviationObject.keySet();
return fullToAbbreviationMap.keySet();
}

public Collection<Abbreviation> getAllLoaded() {
return fullToAbbreviationObject.values();
return fullToAbbreviationMap.values();
}

private void openMaps(MVStore store) {
fullToAbbreviationMap = store.openMap(FULL_TO_ABBREVIATION_MAP_NAME);
abbreviationToAbbreviationMap = store.openMap(ABBREVIATION_TO_ABBREVIATION_MAP_NAME);
dotlessToAbbreviationMap = store.openMap(DOTLESS_TO_ABBREVIATION_MAP_NAME);
shortestUniqueToAbbreviationMap = store.openMap(SHORTEST_UNIQUE_TO_ABBREVIATION_MAP_NAME);
}
}
Loading