diff --git a/modules/researchsoftware/exercise-is-it-research-software.md b/modules/researchsoftware/exercise-is-it-research-software.md index c94df799..46312237 100644 --- a/modules/researchsoftware/exercise-is-it-research-software.md +++ b/modules/researchsoftware/exercise-is-it-research-software.md @@ -1,5 +1,5 @@ --- -title: Research software? +title: Is it Research Software? type: exercise order: 5 --- diff --git a/modules/researchsoftware/exercise-research-life-cycle.md b/modules/researchsoftware/exercise-research-life-cycle.md index c8dda175..8231f57f 100644 --- a/modules/researchsoftware/exercise-research-life-cycle.md +++ b/modules/researchsoftware/exercise-research-life-cycle.md @@ -1,7 +1,7 @@ --- -title: Research life cycle +title: Research Life Cycle type: exercise -order: 6 +order: 5 --- ## Exercise: Where does software fit in the research life cycle? diff --git a/modules/researchsoftware/exercise-writing-software.md b/modules/researchsoftware/exercise-writing-software.md new file mode 100644 index 00000000..1e917cea --- /dev/null +++ b/modules/researchsoftware/exercise-writing-software.md @@ -0,0 +1,25 @@ +--- +title: Writing Software +type: exercise +order: 2 +--- + +## Writing software with Python (10 minutes) + + +https://www.online-python.com/ + +### If you have never written software before: + +- Visit https://www.online-python.com/ + - it will show you as an example the code for a function that adds items together. +- Play around with the code a bit, e.g. + - Try using words instead of numbers as inputs. What happens and why? + - Add a function that subtracts items rather than adding them. What happens now if you use words instead of numbers, and why? + - What else can you do? + +### If you have experience in writing software: + +- Buddy up with someone from the above group. +- Use the 4-eye principle (pair programming), to watch what they are doing and help them figure out what is going on and why. 
+- Try to be conscious in explaining what is important and what your buddy needs to know without overcomplicating matters or taking over the keyboard. diff --git a/modules/researchsoftware/software-and-data.md b/modules/researchsoftware/further-reading.md similarity index 50% rename from modules/researchsoftware/software-and-data.md rename to modules/researchsoftware/further-reading.md index 61fed1f0..44d6ec3e 100644 --- a/modules/researchsoftware/software-and-data.md +++ b/modules/researchsoftware/further-reading.md @@ -1,17 +1,19 @@ --- -title: Software and data +title: Further Reading type: reading -order: 2 +order: 6 --- -### Software is data (10 minutes, optional) +# Software and data + +## Software is data + In computer science, the fact that software is data is considered one of the fundamental concepts of computing. The fact that the thing that operates the machine (software) is the same kind of thing as the thing it operates on (data) is definitely one of the strengths of current computer systems and one of the main reasons why we can do such complex and powerful things with the combination of hardware and software. -Read the following blogpost about why this concept is so powerful: -https://www.blackliszt.com/2014/04/fundamental-concepts-of-computing-software-is-data.html +The following blogpost discusses why this concept is so powerful: [Fundamental Concepts of Computing: Software is Data!](https://www.blackliszt.com/2014/04/fundamental-concepts-of-computing-software-is-data.html) +## Software is a special type of data -### Software is a special type of data (10 minutes, optional) However for most practical purposes in most domains of scientific research (except maybe the domain of computer science) it is useful to make a distinction between the data that is software and other data. From now on, when we use the word data, we mean the kind of data which is not software and which we use to store more static information. 
- Software is executable, data is not. @@ -19,12 +21,17 @@ However for most practical purposes in most domains of scientific research (exce - Software is a creative work, scientific data are facts or observations. - The lifetime of software is generally not as long as that of data. +[Software vs Data](https://github.com/danielskatz/software-vs-data) + +[Software vs. data in the context of citation](https://doi.org/10.7287/peerj.preprints.2630v1) + -https://github.com/danielskatz/software-vs-data +# The role of Research Software -https://doi.org/10.7287/peerj.preprints.2630v1 +The following piece was written after a workshop called "The Future of Research Software", held in the Netherlands in 2022. +It explores the different roles for research software in the research life cycle, strengthening the case for sustainable software. -### Discussion +[Defining the Roles of Research Software](https://upstream.force11.org/defining-the-roles-of-research-software/) -- Can you think of examples where the line between software and data becomes fuzzy? 
+[Defining Research Software: a controversial discussion](https://zenodo.org/records/5504016): Summary Report of FAIR4RS Subgroup 3 activity and discussion diff --git a/modules/researchsoftware/info.md b/modules/researchsoftware/info.md index 6d619d91..ad311a3a 100644 --- a/modules/researchsoftware/info.md +++ b/modules/researchsoftware/info.md @@ -1,5 +1,5 @@ --- -title: Learning objectives +title: Learning Objectives type: info order: 0 --- diff --git a/modules/researchsoftware/media/FAIR-software-paper.png b/modules/researchsoftware/media/FAIR-software-paper.png new file mode 100644 index 00000000..9eb40c9a Binary files /dev/null and b/modules/researchsoftware/media/FAIR-software-paper.png differ diff --git a/modules/researchsoftware/media/VScode_snapshot.png b/modules/researchsoftware/media/VScode_snapshot.png new file mode 100644 index 00000000..9d4cfc2c Binary files /dev/null and b/modules/researchsoftware/media/VScode_snapshot.png differ diff --git a/modules/researchsoftware/media/data_definition.png b/modules/researchsoftware/media/data_definition.png new file mode 100644 index 00000000..8d52f997 Binary files /dev/null and b/modules/researchsoftware/media/data_definition.png differ diff --git a/modules/researchsoftware/media/modern-software.avif b/modules/researchsoftware/media/modern-software.avif new file mode 100644 index 00000000..f7e1e551 Binary files /dev/null and b/modules/researchsoftware/media/modern-software.avif differ diff --git a/modules/researchsoftware/media/python-online.png b/modules/researchsoftware/media/python-online.png new file mode 100644 index 00000000..51e5800e Binary files /dev/null and b/modules/researchsoftware/media/python-online.png differ diff --git a/modules/researchsoftware/media/research-cycle-RS.png b/modules/researchsoftware/media/research-cycle-RS.png new file mode 100644 index 00000000..fd6a52ba Binary files /dev/null and b/modules/researchsoftware/media/research-cycle-RS.png differ diff --git 
a/modules/researchsoftware/research-software.md b/modules/researchsoftware/research-software.md deleted file mode 100644 index 2b6c170a..00000000 --- a/modules/researchsoftware/research-software.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: Research software -type: reading -order: 4 ---- - -## The role of research software (10 minutes) - -The following piece was written after a workshop called "The Future of Research Software", held in the Netherlands in 2022. - -It explores the different roles for research software in the research life cycle, strengthening the case for sustainable software. - -[Defining the Roles of Research Software](https://upstream.force11.org/defining-the-roles-of-research-software/) - -After reading the piece, discuss the following questions: - -- What are the different roles of research software? -- What are the challenges for each of these roles? -- How can we address these challenges? diff --git a/modules/researchsoftware/slides-researchsoftware.md b/modules/researchsoftware/slides-researchsoftware.md index 857fc82f..cb1218eb 100644 --- a/modules/researchsoftware/slides-researchsoftware.md +++ b/modules/researchsoftware/slides-researchsoftware.md @@ -14,9 +14,7 @@ order: 3 ## *Ceci n'est pas une photo* -
-
[The Event Horizon Telescope Collaboration et al. 2019](https://doi.org/10.3847/2041-8213/ab0ec7) (CC BY 3.0) @@ -40,11 +38,9 @@ The data can be converted into an image using custom software. ## The research lifecycle -
The Research Life Cycle -
-Software is used all across the research lifecycle +In which stages is software used? Note: @@ -54,9 +50,9 @@ Software is used all across the research life cycle. -
- -
+### What is Research Software? + + definition of Research Software from the FAIR4RS working group @@ -71,19 +67,59 @@ The code written in R or Python for an analysis would be research software, howe Just like a custom-made Excel macro that is used to analyse data. Or a custom-made web application that is used to collect data. +== + + + +## The research lifecycle + +In which stages is ***Research Software*** used? + +The Research Life Cycle + + +Note: + +Research Software is *mainly* used in "Collecting" and "Processing & analyzing" steps. +However, non-research software can also be used in these steps, and research software can also be used in other steps. + +== + + + +## The research lifecycle + +In which stages is ***Research Software*** used? + +The Research Life Cycle + + +Note: + +Research Software is *mainly* used in "Collecting" and "Processing & analyzing" steps. +However, non-research software can also be used in these steps, and research software can also be used in other steps. + === ## Why the distinction? -- Research software is an important asset and output of research -- Enable proper attribution -- Increase Findability and Reuse +Defining Research Software... + +- Acknowledges its importance during research +- Designates it as research output +- Enables proper attribution +- Facilitates findability and reusability Note: -The distinction is important +By defining research software: + +- we can more easily justify and emphasize how essential it is while doing research +- we can emphasize that creating (good) research software is an essential part of research and that the product is a true "deliverable" e.g. 
in grant applications +- we give the opportunity for the developers to gain (citable) recognition for their work +- by tagging software as "research software", we make it more (computer) findable, and therefore more easily reused === @@ -92,8 +128,8 @@ The distinction is important ## Take home messages - Software is an important part of research -- Not all software used in research is research software -- It is important to regard research software in the whole research lifecycle +- Not all software used in research is Research Software +- Defining "Research Software" provides recognition in the research community === diff --git a/modules/researchsoftware/slides-software.md b/modules/researchsoftware/slides-software.md index 795be0f5..f7d337e0 100644 --- a/modules/researchsoftware/slides-software.md +++ b/modules/researchsoftware/slides-software.md @@ -1,5 +1,5 @@ --- -title: What is software? +title: What is Software? type: slides order: 1 --- @@ -10,12 +10,12 @@ order: 1 === - + ## Some history -
+ -
+ Photo by Stefan Kuhn on Wikimedia @@ -27,12 +27,12 @@ See next slide for explanation. == - + ## Some history -
+ -
+ Photo by Stefan Kuhn on Wikimedia @@ -42,29 +42,26 @@ A street organ's machinery is instructed by long "books" of cardboard with holes == - + ## Some history -
- -
+ + https://youtu.be/wbLuMd5zYww?si=3o0zptLY4c3i1ppk&t=275 Note: -A long book of cardboard with holes punched in it is used to give the barrel organ instructions on which mechanical instruments should play what note at which time. +A long book of cardboard with holes punched in it is used to give the street organ instructions on which mechanical instruments should play what note at which time. === - + -## Some history +## Some more history -
-
Photo by Rainer Gerhards on Wikimedia @@ -76,13 +73,11 @@ Who knows what this is? Looks similar to the previous thing, right? This is soft == - + -## Some history +## Some more history -
- -
+ https://youtu.be/kaQmAybWn-w?si=zRmBx4Df68gWuw3e&t=540 @@ -92,13 +87,11 @@ This software was written using special typewriters that would punch holes in ca == - + -## Some history +## Some more history -
- -
+ https://youtu.be/SYpPPIsxq64?si=m__szsXBDI6SP5kx&t=793 @@ -108,58 +101,81 @@ These punchcards would be loaded into the computer to instruct which parts of th === - + + +## Software today + + +Photo by ThisisEngineering on Unsplash -## Software now -
- -
-Photo by Chris Ried on Unsplash - Note: -These days, software usually gets written using a computer, in a text editor program, so the act of writing software requires software itself. The software is stored in memory and on a hard disk rather than on cardboard (remember, one card per line...), but it basically still does the same: Software is a set of instructions that tells hardware what to do. +Q: In what way is modern software different from historic punch-card "software"? + +Some possible answers: +These days, software usually gets written using a computer, in a text editor program, so the act of writing software requires software itself. The software is stored in memory and on a hard disk rather than on cardboard (remember, one card per line...). Modern software often builds on other software, or uses specific parts of other software packages, rather than reinventing the wheel. + +However, it ultimately still basically does the same: Software is a set of instructions that tells hardware what to do. == - + + +## Complexity of modern software + + +Screenshot of source code for DeepRank2 Note: -On this website you can experiment with writing software (python code) yourself: -https://www.online-python.com +In this image, we can get a glimpse of the complexity of modern software. We can see that software has a certain structure (indicated by differently colored text), is often composed of many lines of code (right side of image shows the entire length of the code), and is split over multiple intercommunicating files (left). === - + + +## Software is a form of data -## Software is like other data + +Definition by Merriam Webster English Dictionary -Software is ... +== + + -- stored as bits -- read, loaded and processed -- can be input, and can be output +## Software is data +Software can be ... 
+ +- stored as bits +- read, loaded, and processed +- input and/or output Note: -Software is stored as bits and read from disk, loaded into memory and processed in exactly the same way as other data. Software can be input, and software can be output. In fact, one of the major breakthroughs in computer science was when people realized that the instructions of the machine could be handled and stored the same as the data that it operated on. +- Software is stored as bits and read from disk, loaded into memory, and processed similar to other types of data. +- Software can be input, and software can be output. In fact, one of the major breakthroughs in computer science was when people realized that the instructions of the machine could be handled and stored the same as the data that it operated on. +- And fits all 3 dictionary definitions of data quite well. +== -=== + + +## BUT + +== -## Software is not 'just' data +## Software is a special type of data Software is... -- **complex**: code is creatively generated, interconnected and multi-layered -- **interdependent**: it builds upon and therefore depends on other software -- **executable**: it is not static, but can be run (to process data) -- **dynamic**: it can (and will) break soon, needs to be updated +- **complex**: it is creatively generated, interconnected, and multi-layered +- **interdependent**: it builds upon and therefore depends on other software (and data) +- **executable**: it needs to be run to have a value (e.g. to process data) +- **dynamic**: it can (and will) break soon and therefore needs to be regularly updated Note: @@ -170,9 +186,34 @@ Software and data both are digital objects, sharing certain characteristics: the Software is quite different from data, however. Consider: - Complexity; it is not a single file, but a collection of files that are interconnected and multi-layered, and do not necessarily stand on their own. 
Software is also the result of a creative process that provides a tool to do something, and not the result of a measurement or observation. -- Interdependence; software is often built using other software, and rarely built completely from scratch. This makes it dependent on other existing applications, which themselves may also change over time. -- Executability; software is in its dryest form a set of instructions that can be an archive of a procedure. However, the main goal of software is that these instructions can be executed. Data, by contrast, stand on their own. -- Dynamic vs static; its interdependence and context-dependency drives software to require maintenance to retain its value, and this maintenance is not straightforward. Maintenance is also counter to academic culture; it does not fit in existing structures (both in terms of reward/recognition, but also in terms of funding and understanding of what is needed). Versioning of software is very common, while data is often static (though versions may happen). +- Interdependence; software is often built using other software, and rarely built completely from scratch. This makes it dependent on other existing applications, which themselves may also change over time. In the context of research software it also often depends on the data, data formats, data standards, metadata, etc, which also change when new equipment becomes available. +- Executability; software is in its dryest form a set of instructions that can be an archive of a procedure. However, the main goal of software is that these instructions can be executed. While most forms of data can stand on their own (e.g. I can look at the list of ages of all the people in a room and make conclusions about average age, etc.), software cannot be directly interpreted (only in the context of what it does). 
+- Dynamic vs static; its interdependence and context-dependency drive software to require maintenance to retain its value, and this maintenance is not straightforward. Versioning of software is very common, while data is often static (though versions may happen). + +== + + + +## Software requires dedicated maintenance + +- Software needs to be actively maintained to remain useable. + - This is contrary to 'regular' data, which is expected to remain static forever. +- Maintenance is often lacking in academic contexts + - lack of funding + - lack of incentives/rewards + - lack of understanding and expertise (maintenance is a different skill from creation) + - individualistic work with fast turnover + + + +Note: +'Regular' data can sometimes be versioned, but this is more the exception than the rule. + +Regarding the reasons for lack of maintenance: +- funding opportunities for this do exist, but are rare and usually only for large projects +- while you can sometimes publish a major update to a software package, this is also rare, and will even more rarely result in a high-impact/highly cited paper. It is also not highly appreciated e.g. on a CV. +- while academics can often create software that "does the job" well, a different skillset is required to maintain the software (or write it in a way to facilitate maintenance) +- academics often work on projects by themselves and then move on soon after their papers are published, leaving a knowledge gap for others to maintain the software === @@ -180,10 +221,18 @@ Software is quite different from data, however. Consider: ## Software vs data management -- Software is a **living thing** -- Some FAIR data management practices apply to software -- Many FAIR principles do not apply easily to software -- Good data management will not ensure good software management + + +FAIR Software Paper Note: The different nature of software also provides opportunities but also requires extra thought on its management. 
@@ -204,6 +253,7 @@ Consider also version control software, a good software development practice tha - Software is a 'living thing' - We need extra and different techniques for good software management + ===