GSoC 2025 ideas list

JabRef in Google Summer of Code 2025

JabRef is a powerful, open-source, cross-platform citation and reference management tool designed to help researchers stay organized and efficient. With JabRef, you can effortlessly collect, organize, and manage your literature sources, giving you more time to focus on what truly matters: your research.

By contributing to JabRef, you contribute to advancing global research. Trusted by over 10,000 researchers worldwide, JabRef plays a vital role in shaping the future of academic and scientific discovery. Your skills and creativity can help push the boundaries of what JabRef can achieve.

Built in Java, JabRef is designed with a strong emphasis on high-quality, modern, and maintainable code. As a contributor, you’ll have the opportunity to enhance your technical skills, deepen your understanding of Java development, and learn best practices in open source collaboration. Whether you're a beginner or an experienced developer, working on JabRef will help you grow as a programmer while making a meaningful impact on a tool that supports researchers around the globe.

We are passionate about open source and pride ourselves on fostering collaboration within a diverse and inclusive community. JabRef is dedicated to providing a welcoming environment for newcomers to open source, making it an ideal starting point for anyone eager to contribute. With four successful years of Google Summer of Code (GSoC) participation, we’ve achieved significant milestones in enhancing JabRef as a user-friendly research tool. Each project has been a meaningful step toward empowering researchers worldwide. As a GSoC participant with JabRef, you'll have the opportunity to grow your technical skills, coding expertise, and open source experience. Beyond the invaluable learning, participants receive a stipend from Google and gain access to a global professional network that can open doors for their future.

Below, you’ll find some project ideas to inspire your contributions to JabRef through GSoC. We’ve also included links to provide more background information and context.

Links

What is Google Summer of Code?
GSoC timeline
- latest proposal deadline: TBA
- coding until: TBA 18:00 UTC (can be extended under conditions)
GSoC stipends: starting at 750 USD, depending on the country.
Google's guide on making first contact
Checklist for items contained in the proposal
Google's guide on writing a good proposal

(All summarized information is tentative. The definitive information is on the linked pages.)

Projects

This page lists ideas for potential projects for participants in Google Summer of Code 2025. This is by no means a closed list, so contributors are free to propose alternative activities related to the project (the list of feature requests and the GitHub issue tracker might serve as additional sources of inspiration). Contributors are strongly encouraged to discuss their ideas with the developers and the community to improve their proposal before submission (e.g., using the Gitter channel or the forum). It's also a good idea to start working on one of the smaller issues to become familiar with the contribution process. Successful pull requests increase the chance of being accepted as a mentee.

Improved Journal Abbreviations

Currently, JabRef has a single list of journal abbreviations. This list is a combination of the .csv files at https://github.com/JabRef/abbrv.jabref.org/tree/main/journals. Instead of showing a single "JabRef built-in list", the dropdown should show the various lists we offer: built-in lists, external lists, and custom lists. Each list can then be enabled or disabled with a click. This makes it easier for users to find issues in abbreviation lists and allows them to customize the lists according to their field (e.g., physics, information science, ...).

Full JabRef issue: https://github.com/JabRef/jabref/issues/12364
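
As a rough illustration of the idea above, the following sketch models several abbreviation lists that can be enabled and disabled independently. All type and method names (Kind, AbbreviationList, AbbreviationRegistry) are hypothetical and not taken from JabRef's code base, and the sample entry is for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

public final class AbbreviationListsDemo {

    /** Hypothetical kinds of lists, mirroring "built-in, external, custom". */
    enum Kind { BUILT_IN, EXTERNAL, CUSTOM }

    /** One named list that the user can switch on or off. */
    static final class AbbreviationList {
        final String name;
        final Kind kind;
        final Map<String, String> abbreviations = new LinkedHashMap<>();
        boolean enabled = true;

        AbbreviationList(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }

        /** Stores the journal name case-insensitively; returns this for chaining. */
        AbbreviationList add(String fullName, String abbreviation) {
            abbreviations.put(fullName.toLowerCase(Locale.ROOT), abbreviation);
            return this;
        }
    }

    /** Looks up abbreviations in enabled lists only, in registration order. */
    static final class AbbreviationRegistry {
        private final List<AbbreviationList> lists = new ArrayList<>();

        void register(AbbreviationList list) {
            lists.add(list);
        }

        void setEnabled(String listName, boolean enabled) {
            lists.stream()
                 .filter(l -> l.name.equals(listName))
                 .forEach(l -> l.enabled = enabled);
        }

        Optional<String> abbreviate(String fullName) {
            String key = fullName.toLowerCase(Locale.ROOT);
            return lists.stream()
                        .filter(l -> l.enabled)
                        .map(l -> l.abbreviations.get(key))
                        .filter(Objects::nonNull)
                        .findFirst();
        }
    }

    public static void main(String[] args) {
        AbbreviationRegistry registry = new AbbreviationRegistry();
        registry.register(new AbbreviationList("physics", Kind.BUILT_IN)
                .add("Physical Review Letters", "Phys. Rev. Lett."));

        System.out.println(registry.abbreviate("Physical Review Letters"));
        registry.setEnabled("physics", false); // the user unticks the list
        System.out.println(registry.abbreviate("Physical Review Letters"));
    }
}
```

Keeping the lists separate, rather than merging them into one map, is what would let the UI show which list an abbreviation came from.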

Skills required:

Java, JavaFX

Possible Mentors:

@calixtus, @koppor

Project size:

90h (small)

Welcome Walkthrough

This project aims to create an engaging and informative first start screen for JabRef, enhancing the initial user experience and showcasing the best features of the software. This screen will differ from the standard interface displayed when no database is open, providing a tailored introduction for new users.

Hints

Configuration of Paper Directory: Implement a feature allowing users to easily set up and manage their paper directory, as detailed in Issue #41.
Integration of Online Services:
- Include options for update checks, connecting with online services like Grobid (referencing Issue #566), fetchers, and full-text search capabilities.
- Incorporate telemetry features with a clear and concise privacy statement.
Creation of Example Library: Develop a feature to create an example library, helping new users quickly understand JabRef's functionality.
Community Engagement Tools: Add links to the JabRef forum for support and Mastodon for community interaction.
Donation Prompt: Encourage support for JabRef through a tastefully integrated donation option.
User Group-Specific Defaults: Offer pre-configured default preferences catering to different user groups, such as "relaxed users" wanting all features, and "pro-users" who prefer managing BibTeX files without additional features (as per Issue #9491).

(These are just ideas; they need to be refined during the project.)

Expected Outcome:

A welcome dialog with nice and welcoming UX

Examples:

The welcome dialog should ask for: Configuration of Paper Directory, Integration of Online Services (Grobid, Telemetry), Creation of Example Library, Community Engagement Tools, Link to Donation page
The welcome dialog should offer some sensible User Group-Specific Defaults: Offer pre-configured default preferences catering to different user groups, such as "relaxed users" wanting all features, and "pro-users" who prefer managing BibTeX files without additional features (as per Issue #9491).
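
The user-group defaults could be modeled as presets over a set of optional features. The following is only a minimal sketch: the enums Feature and UserGroup are hypothetical, and the feature names are taken loosely from the hints above rather than from JabRef's preferences.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

public final class WelcomeDefaultsDemo {

    /** Optional features named in the hints; the enum itself is hypothetical. */
    enum Feature { UPDATE_CHECK, ONLINE_FETCHERS, GROBID, FULLTEXT_SEARCH }

    /** Hypothetical user groups with their pre-configured defaults. */
    enum UserGroup {
        /** "Relaxed users" want all features. */
        RELAXED(EnumSet.allOf(Feature.class)),
        /** "Pro-users" manage BibTeX files without additional features. */
        PRO(EnumSet.noneOf(Feature.class));

        private final Set<Feature> enabledByDefault;

        UserGroup(Set<Feature> enabledByDefault) {
            this.enabledByDefault = Collections.unmodifiableSet(enabledByDefault);
        }

        Set<Feature> enabledByDefault() {
            return enabledByDefault;
        }

        boolean enables(Feature feature) {
            return enabledByDefault.contains(feature);
        }
    }

    public static void main(String[] args) {
        for (UserGroup group : UserGroup.values()) {
            System.out.println(group + " -> " + group.enabledByDefault());
        }
    }
}
```

A welcome dialog could apply the chosen preset once and still let the user change individual settings afterwards.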

Skills required:

Java, JavaFX

Possible Mentors:

@koppor, @tobiasdiez

Project size:

175h (medium)

Using PostgreSQL as full backend for JabRef

Currently, JabRef holds all entries in memory. It even converts LaTeX to Unicode and vice versa to support better search. While this provides a great UX, it leads to huge memory consumption. The more "proper" way is to use a database (such as PostgreSQL) to store the entries. Then, not all entries need to be loaded into memory. The first step is to introduce a data-access layer: the main table should read from the SQL database instead of from an in-memory list of all entries. Possible future work may be: https://www.zotero.org/support/dev/client_coding/direct_sqlite_database_access

There can be an initial phase to evaluate whether PostgreSQL is the right DBMS as backend for JabRef. For instance, DuckDB and SQLite were also discussed. So far, PostgreSQL turned out best (especially for handling regular expression search on the database itself), but things may have changed in 2025.
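
To make the data-access layer idea more concrete, here is a minimal sketch of what such an interface might look like. The names EntryRow, EntryRepository, and InMemoryEntryRepository are assumptions for illustration and do not exist in JabRef; a SQL-backed implementation would answer the same calls with paged queries instead of a Java list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public final class EntryDataAccessDemo {

    /** Minimal stand-in for a bibliography entry (hypothetical, not JabRef's entry class). */
    record EntryRow(String citationKey, String title) { }

    /** Hypothetical data-access interface: the table asks for pages, not the whole library. */
    interface EntryRepository {
        int count();

        List<EntryRow> findPage(int offset, int limit);

        List<EntryRow> searchTitle(String needle, int limit);
    }

    /** In-memory implementation for tests; a database-backed one would page via SQL. */
    static final class InMemoryEntryRepository implements EntryRepository {
        private final List<EntryRow> rows;

        InMemoryEntryRepository(List<EntryRow> rows) {
            this.rows = new ArrayList<>(rows);
        }

        @Override
        public int count() {
            return rows.size();
        }

        @Override
        public List<EntryRow> findPage(int offset, int limit) {
            if (offset < 0 || limit < 0) {
                throw new IllegalArgumentException("offset and limit must be non-negative");
            }
            int from = Math.min(offset, rows.size());
            int to = (int) Math.min((long) from + limit, rows.size());
            return new ArrayList<>(rows.subList(from, to));
        }

        @Override
        public List<EntryRow> searchTitle(String needle, int limit) {
            String lowered = needle.toLowerCase(Locale.ROOT);
            List<EntryRow> hits = new ArrayList<>();
            for (EntryRow row : rows) {
                if (hits.size() >= limit) {
                    break;
                }
                if (row.title().toLowerCase(Locale.ROOT).contains(lowered)) {
                    hits.add(row);
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        List<EntryRow> rows = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            rows.add(new EntryRow("key" + i, "Title " + i));
        }
        EntryRepository repository = new InMemoryEntryRepository(rows);
        // The main table would request only the rows that are currently visible.
        System.out.println(repository.findPage(2, 2));
    }
}
```

With the interface in place, the choice of DBMS stays an implementation detail behind it, which fits the evaluation phase described above.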

Internal note: This is issue https://github.com/JabRef/jabref/issues/10209

Skills required:

PostgreSQL, Java, JavaFX

Possible Mentors:

@koppor, @InAnYan, @calixtus

Project size:

175h (medium)

Improved LibreOffice-JabRef integration

Description:

JabRef can connect to LibreOffice to offer premier reference management by allowing users to cite library entries directly into the document, and then generate bibliographies based on the cited entries. See JabRef LibreOffice Integration.

We have a collection of independent projects available for the LibreOffice/OpenOffice integration feature of JabRef.

Project 1: Currently, custom styles (JStyles) and CSL styles are supported. In the LaTeX world, BST styles (specified via .bst files) are still popular. JabRef already has BST support, but it is currently not accessible via the UI.
- Expected deliverable: It should be possible to select a .bst file, which is then used for rendering into the LibreOffice document. [Details: #624]
Project 2: JabRef in LibreOffice should support automatically updating references when switching from CSL-based formats to JStyle (or BST)-based formats and back. Currently, if users realize that they need another style family, the workaround is to cite all entries again with the new style and then refresh the bibliography. This is not user-friendly when citation styles need to be changed for submission to different journals (one use case) or because of a last-minute change of decision. For this project, the starting step will be unifying the "reference mark" (document annotation) format for all these style types, so that the entry information can be parsed across styles. This project thus pairs well with Project 1.
- Expected deliverable: On changing style type (CSL/BST/JStyle), all references in the documents should seamlessly adapt to the new style.
Project 3: In the case of CSL styles, reference management software like Zotero and Mendeley can read each other's citations in LibreOffice. This is made possible by following a specific format of document annotations, embedding information in CSL JSON. In JabRef, the internal format of references is currently a JabRef-custom format. It should be changed to the format used by Zotero, so that cross-compatibility can be ensured. See the discussion at https://github.com/JabRef/jabref/issues/2146#issuecomment-891432507 for details. This includes: i) Implementation of that format, ii) Implementation of a converter from the "old" JabRef format to the new one. The converter could be implemented within OpenOffice (similar to JabRef_LibreOffice_Converter).
- Expected deliverable: One can seamlessly switch between working on LibreOffice documents with citations from Zotero and from JabRef.

Skills required:

Java, JavaFX

Possible Mentors:

@Siedlerchr, @subhramit

Project size:

350h (large): If (Project 1 + Project 2 + Project 3)
175h (medium): If (Project 1 + Project 2) OR (Project 1 + Project 3)
90h (small): If Project 1 OR Project 3

Improve handling of ancient documents by OCR and AI

JabRef, a comprehensive literature management software, currently handles both metadata and text-based PDF documents. However, a significant limitation arises with scanned PDFs, particularly historical articles, which are not text-searchable due to their image-based format. This project aims to bridge this gap by integrating advanced OCR (Optical Character Recognition) technology, enabling full-text search in scanned PDFs.

Useful links:

A Document AI Package: https://github.com/deepdoctection/deepdoctection
Hand-written text recognition in historical documents: https://github.com/githubharald/SimpleHTR#handwritten-text-recognition-with-tensorflow
Java OCR with Tesseract: Baeldung Guide
OCRmyPDF Installation and Usage: GitHub Repository
ChatOCR and ChatGPT Integration: Blog Article
AI-Powered OCR: Addepto Blog
Tika OCR Integration: Apache Tika Wiki
Tesseract OCR Library: Official Documentation
Surya, an AI-powered OCR toolkit (described as better than Tesseract, but written in Python): https://github.com/VikParuchuri/surya

Some aspects:

Add an option to call an OCR engine from JabRef, e.g., cloud based or local installs
Define a common interface to support multiple OCR engines
Provide a good default set of settings for the OCR engines
Support expert configuration of the settings
Add the extracted text as a layer to the PDF so that Apache Lucene can parse it
Add an option to further process the text with Grobid for training and metadata extraction

Expected outcome:

A) Develop a common interface within JabRef to accommodate multiple OCR engines, ensuring flexibility and expandability.
B) Enable expert users to fine-tune OCR settings, catering to specific needs or document formats.
C) Incorporate the OCR-extracted text as a searchable layer in PDFs, allowing Apache Lucene to index and search the content.
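
One possible shape for the common interface in outcome A, with default and expert settings as in outcome B, is sketched below. Everything here is an assumption for illustration: OcrSettings, OcrEngine, OcrEngineRegistry, and EchoEngine are hypothetical names, the default values are placeholders, and EchoEngine performs no OCR at all so that the sketch runs without any OCR library installed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public final class OcrEngineDemo {

    /** Hypothetical settings: a default set plus expert overrides. */
    record OcrSettings(String language, int dpi) {
        /** Placeholder defaults, not recommended values. */
        static OcrSettings defaults() {
            return new OcrSettings("eng", 300);
        }

        OcrSettings withLanguage(String newLanguage) {
            return new OcrSettings(newLanguage, dpi);
        }
    }

    /** Common interface that every engine (local or cloud-based) would implement. */
    interface OcrEngine {
        String name();

        String recognize(byte[] pageImage, OcrSettings settings);
    }

    /** Keeps the available engines so that the UI can offer a choice. */
    static final class OcrEngineRegistry {
        private final Map<String, OcrEngine> engines = new LinkedHashMap<>();

        void register(OcrEngine engine) {
            engines.put(engine.name(), engine);
        }

        Optional<OcrEngine> byName(String name) {
            return Optional.ofNullable(engines.get(name));
        }

        Set<String> names() {
            return engines.keySet();
        }
    }

    /** Fake engine: reports its inputs instead of recognizing text. */
    static final class EchoEngine implements OcrEngine {
        @Override
        public String name() {
            return "echo";
        }

        @Override
        public String recognize(byte[] pageImage, OcrSettings settings) {
            return "[" + settings.language() + ", " + settings.dpi() + " dpi] "
                    + pageImage.length + " bytes";
        }
    }

    public static void main(String[] args) {
        OcrEngineRegistry registry = new OcrEngineRegistry();
        registry.register(new EchoEngine());
        registry.byName("echo")
                .map(engine -> engine.recognize(new byte[3], OcrSettings.defaults()))
                .ifPresent(System.out::println);
    }
}
```

A real engine adapter (for example, one wrapping a locally installed tool) would implement the same two methods, so the rest of the application would not need to know which engine is in use.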

Skills required:

Proficiency in Java programming.
A keen interest and curiosity in document processing and AI technologies.

Possible mentors:

@Siedlerchr, @InAnYan, @calixtus

Project size:

175h (medium)

Improved SLR Support

Description:

With the ever-growing number of publications in computer science and other fields of research, conducting secondary studies becomes necessary to summarize the current state of the art. For software engineering research, Kitchenham popularized the systematic literature review (SLR) method to address this issue. The main idea is to systematically identify and analyze the majority of relevant publications on a specific topic. This is usually an activity that takes extensive manual effort. Some tool support does exist, but the full potential of tools has not been exploited yet.

JabRef also offers basic functionality for systematic literature reviews that is used by a number of researchers to systematically "harvest" related work based on the fetching capabilities of JabRef. While using the feature, various additional feature requests came up. For instance, created search queries are currently transformed internally by JabRef to the query format of the publisher. It should also be possible to directly input a query at the publisher site, e.g., for IEEE or ACM.

More information: Dominik Voigt, Oliver Kopp, Karoline Wild: Systematic Literature Tools: Are we there yet? ZEUS 2021: 83-88

One key aspect would be improving the fetcher infrastructure in JabRef so that it adapts better to new and changing publisher and journal websites and offers a more direct integration. As an inspiration, see BibDesk.
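
The two query paths mentioned in the description (a query transformed internally versus a query entered directly in the publisher's syntax) could share one abstraction. The sketch below is purely illustrative: PublisherQuery, TransformedQuery, and RawQuery are hypothetical names, and the quoted-terms-joined-by-AND output is a made-up syntax, not the format of IEEE, ACM, or any other publisher.

```java
import java.util.List;
import java.util.stream.Collectors;

public final class SlrQueryDemo {

    /** A query that can be rendered in the syntax one publisher expects. */
    interface PublisherQuery {
        String render();
    }

    /** Path 1: generic search terms, transformed into a (made-up) publisher syntax. */
    record TransformedQuery(List<String> terms) implements PublisherQuery {
        @Override
        public String render() {
            return terms.stream()
                        .map(term -> "\"" + term + "\"")
                        .collect(Collectors.joining(" AND "));
        }
    }

    /** Path 2: text typed by the researcher in the publisher's own syntax, passed through as-is. */
    record RawQuery(String text) implements PublisherQuery {
        @Override
        public String render() {
            return text;
        }
    }

    public static void main(String[] args) {
        List<PublisherQuery> queries = List.of(
                new TransformedQuery(List.of("systematic review", "tool")),
                new RawQuery("(a OR b)"));
        for (PublisherQuery query : queries) {
            System.out.println(query.render());
        }
    }
}
```

A fetcher that accepts a PublisherQuery would not need to care which of the two paths produced it.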

Expected outcome:

Advanced SLR functionality that supports researchers in carrying out a systematic literature review.

We did an initial project organization at https://github.com/users/koppor/projects/2.

Skills required:

Java, JavaFX

Possible mentors:

@koppor, @calixtus

Project size:

175h (medium)

{Your own project}

You can also propose another project. JabRef offers a variety of places where it can be improved. Think as a user or talk to other users. The following places are a good start:

Feature requests prioritized: https://github.com/orgs/JabRef/projects/6
General list of feature requests: http://discourse.jabref.org/c/features
Candidates of university projects, the large ones: https://github.com/orgs/JabRef/projects/3/views/3?filterQuery=status%3A%22free+to+take%22+size-of-project%3Alarge&sortedBy%5Bdirection%5D=desc&sortedBy%5BcolumnId%5D=8246261
