diff --git a/content/en/blog/contribute-open-source-project.md b/content/en/blog/contribute-open-source-project.md
index 993ec738..62a99faa 100644
--- a/content/en/blog/contribute-open-source-project.md
+++ b/content/en/blog/contribute-open-source-project.md
@@ -10,9 +10,9 @@ image: /images/coder.png
 
 _Your guide to contributing to open source projects_
 
-Are you feeling intimidated by the thought of stepping into the world of open source contributions? You're not alone! Many developers, even experienced ones, hesitate to take that first step. But here's the secret: **you don't have to be a seasoned expert to make a difference.**
+Are you feeling intimidated by the thought of stepping into the world of open source contributions? You don't have to be an expert to help.
 
-Here are some tips to guide you through the process:
+You might find this guide helpful: https://opensource.guide/how-to-contribute as well as the [Reddit Opensource community](https://www.reddit.com/r/opensource/).
 
 **1. Start small, think big:**
 
@@ -47,25 +47,21 @@ Here are some tips to guide you through the process:
 
 ## How can I contribute to Harmony in particular?
 
-The world of data science is brimming with groundbreaking possibilities, but navigating its complexities can feel daunting, especially for newcomers. Well, fear not, intrepid data explorer! Open-source projects like Harmony are here to lend a helping hand, and guess what? **You can be a part of it!**
+Read our [guide to contributing to Harmony](/contributing-to-harmony/).
 
-Harmony is a **powerful data harmonisation tool** built with love and open-source magic. It uses natural language processing (NLP) to bridge the gap between diverse research studies, automatically comparing and grouping similar items across datasets.
+Harmony is a **powerful data harmonisation tool** ses natural language processing (NLP) to bridge the gap between diverse research studies, automatically comparing and grouping similar items across datasets. Here are a few ways you can get involved in the project:
 
-But Harmony isn't just about code and algorithms; it's a vibrant community of passionate individuals like you and me. We're researchers, developers, and data enthusiasts united by a common goal: **to empower groundbreaking discoveries through accessible research data.**
-
-So, how can you, a curious and enthusiastic soul, contribute to this open source project? The answer is simple: **by joining Harmony!** Here are a few ways you can make your voice heard:
-
-**1. Code your way to data harmony:**
+### 1. Get coding
 
 Harmony's codebase is built on the bedrock of Python, and contributions are always welcome. Whether you're a seasoned developer or a coding newbie, there's a place for you. You can:
 
-* [Browse open issues and pull requests](https://github.com/harmonydata/harmony/issues): Find a challenge that sparks your interest and contribute your unique skills.
+* [Browse open issues and pull requests](https://github.com/harmonydata/harmony/issues) and find a challenge that sparks your interest and contribute your unique skills.
 * **Help maintain the existing code:** Fix bugs, improve documentation, and suggest optimizations.
 * **Develop new features:** Take Harmony to the next level by proposing and implementing innovative solutions.
 
-**2. Dive into the world of NLP:**
+### 2. Work on the NLP models
 
-The heart of the magic of Harmony is the large language models that it depends on, taken from the [Hugging Face Hub](https://huggingface.co/docs/hub/models-the-hub). If you have a knack for languages and a thirst for knowledge, you can contribute by:
+The heart of the magic of Harmony is the large language models that it depends on, taken from the [Hugging Face Hub](https://huggingface.co/docs/hub/models-the-hub). You can contribute by:
 
 * **Exploring and testing NLP algorithms:** Experiment with different NLP techniques to improve Harmony's accuracy and efficiency.
 * **Developing new language models:** Try training your own LLM specialised in psychology.
@@ -73,7 +69,7 @@ The heart of the magic of Harmony is the large language models that it depends o
 
 The deeper we understand language, the better we can harmonise the world's research data.
 
-**3. Spread the word, be the data cheerleader:**
+### 3. Publicise Harmony
 
 Harmony's mission thrives on awareness and accessibility. You can be a champion for open data by:
 
@@ -83,5 +79,4 @@ Harmony's mission thrives on awareness and accessibility. You can be a champion
 
 **Ready to join the Harmony open source project?** Head over to our GitHub repository at [https://github.com/harmonydata/harmony](https://github.com/harmonydata/harmony), explore the free web tool at [harmonydata.ac.uk/app](https://harmonydata.ac.uk/app), and dive into our documentation. We're waiting for you with open arms (and open-source code)!
 
-**Bonus tip:** We also have a Docker container available, making it even easier to get started with Harmony. Just check out our documentation for more details.
-
+**Bonus tip:** We also have a [Docker container](https://hub.docker.com/r/harmonydata/harmonywithtika) available, making it even easier to get started with Harmony. Just check out our documentation for more details.
diff --git a/content/en/blog/contributing.md b/content/en/blog/contributing.md
index 9e2ade33..512aff33 100644
--- a/content/en/blog/contributing.md
+++ b/content/en/blog/contributing.md
@@ -48,6 +48,19 @@ There are lots of ways you can contribute to Harmony! You can work on code, impr
 * Talk about Harmony on social media. Don't forget to tag us on Twitter [@harmony_data](https://twitter.com/harmony_data), Instagram [@harmonydata](https://www.instagram.com/harmonydata/), Facebook [@harmonydata](https://www.facebook.com/harmonydata), LinkedIn [@Harmony](https://www.linkedin.com/company/harmonydata), and YouTube [@harmonydata](https://www.youtube.com/channel/UCraLlfBr0jXwap41oQ763OQ)!
 * Starring and [forking](https://github.com/harmonydata/harmony/fork) Harmony on Github!
 
+## Where do we need help in Harmony?
+
+In particular, the PDF extraction (converting PDFs to structured questionnaire items) is very hard and we have a separate Github repo with examples here: https://github.com/harmonydata/pdf-questionnaire-extraction
+
+We are planning on running a hackathon focused on this aspect of the tool.
+
+Also, other initiatives that could be really useful include:
+
+* Better handling of active vs passive voice in questionnaire items
+* Allowing Harmony to switch LLMs
+* Integration with other websites and tools
+* An h-score: a similarity measure between instruments
+
 ## Raising issues and the issue tracker
 
 The issue list is [in the Github repository](https://github.com/harmonydata/harmony/issues). You can view the open issues, pick one to fix, or raise your own issue. Even if you're not a coder, feel free to raise an issue.