From a4aea1285a4212e34fa3fabce1922c91d867d2e4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Wed, 22 Feb 2023 18:10:11 +0100 Subject: [PATCH 01/35] draft of first two subsections --- thesis-en.tex | 101 ++++++++++++++++++++++++++++++-------------------- 1 file changed, 60 insertions(+), 41 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 8dc722ec..aef00656 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -319,56 +319,71 @@ \chapter{State of the art}\label{r:chapter_stateoftheart} \section{Problems with using semver in Rust}\label{r:section_usageofsemver} -TODO: -\begin{itemize} - \item explain why it is easy to break semver in Rust. - Do that by giving specific, non-obvious code examples. - \item search for sources from which to get examples - \item explain other reasons as to why people tend to break semver - \item don't give yet real-life examples (those will be in the sections under), - write in a general way - \item make it clear that using semver in Rust is hard -\end{itemize} +% TODO: +% - explain why it is easy to break semver in Rust. +% Do that by giving specific, non-obvious code examples. +% - search for sources from which to get examples +% - explain other reasons as to why people tend to break semver +% - don't give yet real-life examples (those will be in the sections under), +% write in a general way +% - make it clear that using semver in Rust is hard + +It might seem easy to maintain semver, but some violations are really hard to notice, when not actively searching for them. Let's look at an example. + +\begin{verbatim} +struct Foo { + x: String +} + +pub struct Bar { + y: Foo +} +\end{verbatim} + +Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. Why? 
{\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits, that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. + +Of course, things can get way more complex. Just for example, having these structs in very different locations complicates keeping track of such behaviours. It should be clear by now, that breaking semver on accident is possible. \section{Consequences of breaking semver} -TODO: -\begin{itemize} - \item describe that breaking semver means that people's code stops compiling - \item describe the possible scale of catastrophes - \item don't give yet real-life examples, write in a general way -\end{itemize} +% TODO: +% - describe that breaking semver means that people's code stops compiling +% - describe the possible scale of catastrophes +% - don't give yet real-life examples, write in a general way + +When you publish a new version of a crate, that is breaking semver, you are causing a major inconvenience for the crate's users. +Their code might just stop compiling, when the offending version gets downloaded. + +This also could happen if the crate containing violation is not an immediate dependency, so one semver break, could result in tons of broken crates. + +Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be really frustrating, and might drive the users to stop using your crate. 
\section{Real-life examples of semver breaks} -TODO: -\begin{itemize} - \item write (and cite) about cases our mentor mentioned in his blogs - \item write about cases users reported in the github issue - \item mention the paper describing that 43\% of yanked releases - are because of semver breaks and 3.7\% of all >300'000 releases are yanked - \item mention that we've developed - a script that scans all releases for the semver breaks - we can detect and the results are presented in some chapter -\end{itemize} +% TODO: +% - write (and cite) about cases our mentor mentioned in his blogs +% - write about cases users reported in the github issue +% - mention the paper describing that 43\% of yanked releases +% are because of semver breaks and 3.7\% of all >300'000 releases are yanked +% - mention that we've developed +% a script that scans all releases for the semver breaks +% we can detect and the results are presented in some chapter \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -TODO: -\begin{itemize} - \item list languages which have semver checking built-in, - explain that the language semantics were made for e.g. semver checking, - \item list current tools for detecting semver breaks in Rust: - cargo-breaking, rust-semverver, cargo-semver-checks - \item for the first two, explain a bit how they work and why they are no longer maintained. - Mieszko's slides have some info about that. - \item for cargo-semver-checks, explain a bit how it works (rustdoc, json, etc.) - and that contrary to the other two, it is maintained and it's made to be easily maintained. - Mention that this is the project we're working on. - \item research the current state of semver detection in other languages, - explain that it's hard to do in popular languages, - especially without features like rustdoc. -\end{itemize} +% TODO: +% - list languages which have semver checking built-in, +% explain that the language semantics were made for e.g. 
semver checking, +% - list current tools for detecting semver breaks in Rust: +% cargo-breaking, rust-semverver, cargo-semver-checks +% - for the first two, explain a bit how they work and why they are no longer maintained. +% Mieszko's slides have some info about that. +% - for cargo-semver-checks, explain a bit how it works (rustdoc, json, etc.) +% and that contrary to the other two, it is maintained and it's made to be easily maintained. +% Mention that this is the project we're working on. +% - research the current state of semver detection in other languages, +% explain that it's hard to do in popular languages, +% especially without features like rustdoc. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % @@ -703,6 +718,10 @@ \section{Responsibilities} \textit{Semantic Versioning 2.0.0} (2022) \\ https://semver.org/ +\bibitem[1]{beaman} Predrag Gruevski, + \textit{Towards fearless cargo update} (2022) \\ + https://predr.ag/blog/toward-fearless-cargo-update/ + \end{thebibliography} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From 4ab20d7d5aca4c1e0d40ff09b4415f5778298b99 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Sat, 25 Feb 2023 18:16:40 +0100 Subject: [PATCH 02/35] existing tools subsection MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index aef00656..8cf9d05a 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -342,7 +342,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. Why? 
{\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits, that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. -Of course, things can get way more complex. Just for example, having these structs in very different locations complicates keeping track of such behaviours. It should be clear by now, that breaking semver on accident is possible. +Of course, things can get way more complex. Just for example, having these structs in very different locations complicates keeping track of such behaviours. Section 2.3 will also show that these conditions result in real-life unintended semver breaks. It should be clear by now, that breaking semver on accident is possible. \section{Consequences of breaking semver} @@ -353,7 +353,6 @@ \section{Consequences of breaking semver} When you publish a new version of a crate, that is breaking semver, you are causing a major inconvenience for the crate's users. Their code might just stop compiling, when the offending version gets downloaded. - This also could happen if the crate containing violation is not an immediate dependency, so one semver break, could result in tons of broken crates. Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be really frustrating, and might drive the users to stop using your crate. @@ -385,6 +384,14 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se % explain that it's hard to do in popular languages, % especially without features like rustdoc. +There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. 
Of course, there are some initiatives to combat this - for example, the Elm languge enforces semantic versioning. It's type system enables automatic detection of all API changes. + +Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. + +The second existing tool is rust-semverver, which focuses on the metadata present in the rust-specific rlib binary dynamic static library format. Because of that, unfortunately, the user experience is far from ideal, as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. + +In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. Adding new queries is designed to be quite accessible, and the maintaince comes to keeping adapter up to date with rustdoc changes, which seems to be about as low effort as it could be. + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -718,10 +725,13 @@ \section{Responsibilities} \textit{Semantic Versioning 2.0.0} (2022) \\ https://semver.org/ +% State of the art references: \bibitem[1]{beaman} Predrag Gruevski, \textit{Towards fearless cargo update} (2022) \\ https://predr.ag/blog/toward-fearless-cargo-update/ +% should this be mentioned? 
https://elm-lang.org/ + \end{thebibliography} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From 6f3726f86d882bb73b7ed082f42e441c25abcbee Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Tue, 28 Feb 2023 22:31:37 +0100 Subject: [PATCH 03/35] real life examples, some polishes --- thesis-en.tex | 59 ++++++++++++++++++++++++++------------------------- 1 file changed, 30 insertions(+), 29 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 8cf9d05a..74ee6c98 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -319,45 +319,33 @@ \chapter{State of the art}\label{r:chapter_stateoftheart} \section{Problems with using semver in Rust}\label{r:section_usageofsemver} -% TODO: -% - explain why it is easy to break semver in Rust. -% Do that by giving specific, non-obvious code examples. -% - search for sources from which to get examples -% - explain other reasons as to why people tend to break semver -% - don't give yet real-life examples (those will be in the sections under), -% write in a general way -% - make it clear that using semver in Rust is hard - It might seem easy to maintain semver, but some violations are really hard to notice, when not actively searching for them. Let's look at an example. - +\vspace{-3pt} \begin{verbatim} -struct Foo { - x: String -} + struct Foo { + x: String + } -pub struct Bar { - y: Foo -} + pub struct Bar { + y: Foo + } \end{verbatim} +\vspace{-5pt} Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. Why? {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits, that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. -Of course, things can get way more complex. 
Just for example, having these structs in very different locations complicates keeping track of such behaviours. Section 2.3 will also show that these conditions result in real-life unintended semver breaks. It should be clear by now, that breaking semver on accident is possible. +Of course, things can get way more complex. Just for example, having these structs in very different locations complicates keeping track of such behaviours. A similar error crept into release v3.2.0 of a well-known crate, {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. +It should be clear by now, that breaking semver on accident is possible. \section{Consequences of breaking semver} -% TODO: -% - describe that breaking semver means that people's code stops compiling -% - describe the possible scale of catastrophes -% - don't give yet real-life examples, write in a general way - When you publish a new version of a crate, that is breaking semver, you are causing a major inconvenience for the crate's users. Their code might just stop compiling, when the offending version gets downloaded. This also could happen if the crate containing violation is not an immediate dependency, so one semver break, could result in tons of broken crates. Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be really frustrating, and might drive the users to stop using your crate. 
-\section{Real-life examples of semver breaks} +\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} % TODO: % - write (and cite) about cases our mentor mentioned in his blogs @@ -368,6 +356,18 @@ \section{Real-life examples of semver breaks} % a script that scans all releases for the semver breaks % we can detect and the results are presented in some chapter +Some of popular Rust crates with millions of downloads happened to break semver: +\begin{itemize} + \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285} + \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}; + \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}; + \item and many more. We have developed a script that scans all releases for semver breaks we can detect, the results are covered in section \ref{r:section_scanning_script} +\end{itemize} + +Of course, the problem is even more prominent in less popular crates, where developers might not be as experienced. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} +claims that out of the yanked (un-publised) releases, semver break was the leading reason for yanking, with shocking 43\% rate. +It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), are yanked, which should show the scale of the problem - thousands of detected semver breaks. + \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} % TODO: @@ -384,13 +384,13 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se % explain that it's hard to do in popular languages, % especially without features like rustdoc. -There aren't many great existing tools for semver checking. 
The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge enforces semantic versioning. It's type system enables automatic detection of all API changes. +There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge enforces semantic versioning. It's type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. -Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. +Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot, even without any public API changes. 
Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. The second existing tool is rust-semverver, which focuses on the metadata present in the rust-specific rlib binary dynamic static library format. Because of that, unfortunately, the user experience is far from ideal, as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. -In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. Adding new queries is designed to be quite accessible, and the maintaince comes to keeping adapter up to date with rustdoc changes, which seems to be about as low effort as it could be. +In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. Adding new queries is designed to be quite accessible, and the maintaince comes to keeping adapter up to date with rustdoc API changes, which seems to be about as low effort as it could be. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % @@ -606,7 +606,7 @@ \section{Continuous integration improvements} -\section{Script} +\section{Script} @@ -635,11 +635,12 @@ \section{Steady increase in tool's popularity} \item list the maintainers of big libraries that started using the tool during our development \end{itemize} -\section{Script} +\section{Script} \label{r:section_scanning_script} TODO: \begin{itemize} - \item show the results of the script that searches all existing releases for detected semver breaks + \item adjust the name of the subsection + \item show the results of the script that searches all existing releases for detected semver breaks \item describe how our new lints can make an impact on the community based on the found semver breaks from the script \end{itemize} From 87df4b4ae68402a7ae7fb814af531e1c20a7d96e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Tue, 28 Feb 2023 22:32:41 +0100 Subject: [PATCH 04/35] remove todo comments MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 23 ----------------------- 1 file changed, 23 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 74ee6c98..2be3ece9 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -347,15 +347,6 @@ \section{Consequences of breaking semver} \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} -% TODO: -% - write (and cite) about cases our mentor mentioned in his blogs -% - write about cases users reported in the github issue -% - mention the paper describing that 43\% of yanked releases -% are because of semver breaks and 3.7\% of all >300'000 releases are yanked -% - mention that we've developed -% a script that scans all releases for the semver breaks -% we can detect and the results are presented in some chapter - Some of popular Rust crates with millions of downloads happened to break semver: \begin{itemize} 
\item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285} @@ -370,20 +361,6 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -% TODO: -% - list languages which have semver checking built-in, -% explain that the language semantics were made for e.g. semver checking, -% - list current tools for detecting semver breaks in Rust: -% cargo-breaking, rust-semverver, cargo-semver-checks -% - for the first two, explain a bit how they work and why they are no longer maintained. -% Mieszko's slides have some info about that. -% - for cargo-semver-checks, explain a bit how it works (rustdoc, json, etc.) -% and that contrary to the other two, it is maintained and it's made to be easily maintained. -% Mention that this is the project we're working on. -% - research the current state of semver detection in other languages, -% explain that it's hard to do in popular languages, -% especially without features like rustdoc. - There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge enforces semantic versioning. It's type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. 
The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. From 2b839e756642a5a4830d243e322204ece5350a31 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Tue, 28 Feb 2023 22:35:01 +0100 Subject: [PATCH 05/35] remove trailing whitespace MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 2be3ece9..f3f1795e 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -583,7 +583,7 @@ \section{Continuous integration improvements} -\section{Script} +\section{Script} From 77e1d6a1867ebcbd586b61b2dbf39bb89e9891bb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Tue, 28 Feb 2023 22:36:26 +0100 Subject: [PATCH 06/35] Add another footnote MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index f3f1795e..d3c2583c 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -361,7 +361,7 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. 
Of course, there are some initiatives to combat this - for example, the Elm languge enforces semantic versioning. It's type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. +There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge\footnote{https://elm-lang.org/} enforces semantic versioning. It's type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. @@ -708,8 +708,6 @@ \section{Responsibilities} \textit{Towards fearless cargo update} (2022) \\ https://predr.ag/blog/toward-fearless-cargo-update/ -% should this be mentioned? 
https://elm-lang.org/ - \end{thebibliography} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From de3cbd902d4dfbb0b14aa5beb539f16671edb46b Mon Sep 17 00:00:00 2001 From: Tomasz Nowak Date: Wed, 1 Mar 2023 10:53:15 +0100 Subject: [PATCH 07/35] Added newlines in tex for easier vim navigation --- thesis-en.tex | 74 ++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 56 insertions(+), 18 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index d3c2583c..ac462a4b 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -319,7 +319,8 @@ \chapter{State of the art}\label{r:chapter_stateoftheart} \section{Problems with using semver in Rust}\label{r:section_usageofsemver} -It might seem easy to maintain semver, but some violations are really hard to notice, when not actively searching for them. Let's look at an example. +It might seem easy to maintain semver, but some violations are really hard to notice, +when not actively searching for them. Let's look at an example. \vspace{-3pt} \begin{verbatim} struct Foo { @@ -332,18 +333,32 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} \end{verbatim} \vspace{-5pt} -Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. Why? {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits, that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. - -Of course, things can get way more complex. Just for example, having these structs in very different locations complicates keeping track of such behaviours. A similar error crept into release v3.2.0 of a well-known crate, {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. 
+Changing the type of {\ttfamily Foo.x} from {\ttfamily String} to {\ttfamily Rc} +causes a semver break, even though it is a non-public field of a non-public struct. +Why? {\ttfamily String} implements the {\ttfamily Send} and {\ttfamily Sync} traits, +which are derived automatically for structs built from such types, making both {\ttfamily Foo} and {\ttfamily Bar} +implement {\ttfamily Send} and {\ttfamily Sync}. +In contrast, {\ttfamily Rc} implements neither of them, +so the change results in the publicly visible struct {\ttfamily Bar} losing both traits. + +Of course, things can get far more complex. +For example, having these structs in very different locations +complicates keeping track of such behaviours. +A similar error crept into release v3.2.0 of a well-known crate, {\ttfamily clap}. +More on that later, in section \ref{r:section_real_life_semver_breaks}. It should be clear by now, that breaking semver on accident is possible. \section{Consequences of breaking semver} -When you publish a new version of a crate, that is breaking semver, you are causing a major inconvenience for the crate's users. +When you publish a new version of a crate that breaks semver, +you cause a major inconvenience for the crate's users. Their code might just stop compiling, when the offending version gets downloaded. -This also could happen if the crate containing violation is not an immediate dependency, so one semver break, could result in tons of broken crates. +This can also happen when the crate containing the violation is not a direct dependency, +so a single semver break can result in many broken crates. -Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be really frustrating, and might drive the users to stop using your crate. +Debugging a cryptic compilation error that starts showing up one day, +without any change to the code, can be really frustrating, +and might drive users to stop using your crate. 
\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} @@ -352,22 +367,45 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285} \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}; \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}; - \item and many more. We have developed a script that scans all releases for semver breaks we can detect, the results are covered in section \ref{r:section_scanning_script} + \item and many more. We have developed a script that scans all releases + for the semver breaks we can detect; the results are covered in section \ref{r:section_scanning_script}. \end{itemize} -Of course, the problem is even more prominent in less popular crates, where developers might not be as experienced. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} -claims that out of the yanked (un-publised) releases, semver break was the leading reason for yanking, with shocking 43\% rate. -It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), are yanked, which should show the scale of the problem - thousands of detected semver breaks. +Of course, the problem is even more prominent in less popular crates, +where developers might not be as experienced. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} +claims that among yanked (unpublished) releases, +semver breaks were the leading reason for yanking, accounting for 43\% of cases. +It also reports that 3.7\% of all releases (of which there are already more than 300 000) +are yanked, which shows the scale of the problem: thousands of detected semver breaks. 
\section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -There aren't many great existing tools for semver checking. The main reason for that, is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge\footnote{https://elm-lang.org/} enforces semantic versioning. It's type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. - -Unfortunately, the Rust langugage's semantic were not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, cargo-breaking, works on the abstract syntax tree. The problem here is that to compare API changes, you must navigate two trees at once, which can get really complex and tedious, because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree might change along with the development of the language, which makes maintenance time-consuming. - -The second existing tool is rust-semverver, which focuses on the metadata present in the rust-specific rlib binary dynamic static library format. Because of that, unfortunately, the user experience is far from ideal, as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. - -In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. Adding new queries is designed to be quite accessible, and the maintaince comes to keeping adapter up to date with rustdoc API changes, which seems to be about as low effort as it could be. +There aren't many great existing tools for semver checking. 
+The main reason for that, is that the semantics of popular languages +do not allow for complete automatic verification. +Of course, there are some initiatives to combat this - for example, +the Elm languge\footnote{https://elm-lang.org/} enforces semantic versioning. +It's type system enables automatic detection of all API changes. +Outside of that, it does not appear that tools for checking semver +in estabilished languages like Python or C++ are commonly used in the industry. + +Unfortunately, the Rust langugage's semantic were not designed with semver in mind. +Despite this, there are some existing tools for semver checking. +First of them, cargo-breaking, works on the abstract syntax tree. +The problem here is that to compare API changes, you must navigate two trees at once, +which can get really complex and tedious, because the abstract syntax tree could change quite a lot, +even without any public API changes. +Another issue is that both language syntax and the structure of the abstract syntax tree +might change along with the development of the language, which makes maintenance time-consuming. + +The second existing tool is rust-semverver, which focuses on +the metadata present in the rust-specific rlib binary dynamic static library format. +Because of that, unfortunately, the user experience is far from ideal, +as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. + +In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. +Adding new queries is designed to be quite accessible, and the maintaince comes to +keeping adapter up to date with rustdoc API changes, which seems to be about as low effort as it could be. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % From 98782db087b94fe8b14261805e28bec47733a832 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Fri, 3 Mar 2023 13:18:33 +0100 Subject: [PATCH 08/35] Review adjustments --- thesis-en.tex | 47 +++++++++++++++++++++++++++-------------------- 1 file changed, 27 insertions(+), 20 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index ac462a4b..2f3ffcd1 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -319,7 +319,7 @@ \chapter{State of the art}\label{r:chapter_stateoftheart} \section{Problems with using semver in Rust}\label{r:section_usageofsemver} -It might seem easy to maintain semver, but some violations are really hard to notice, +It might seem easy to maintain semver, but some violations are hard to notice when not actively searching for them. Let's look at an example. \vspace{-3pt} \begin{verbatim} @@ -335,53 +335,60 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. -Why? {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits, +That's because {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. -Of course, things can get way more complex. -Just for example, having these structs in very different locations -complicates keeping track of such behaviours. -A similar error crept into release v3.2.0 of a well-known crate, {\ttfamily clap}. +The given example is not only non-obvious, but also is even harder to notice +in large codebases, where those struct could be in very different locations. 
+In fact, a similar error crept into release v3.2.0 of a well-known crate +maintained by the Rust team -- {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. +% TODO: add another example It should be clear by now, that breaking semver on accident is possible. \section{Consequences of breaking semver} -When you publish a new version of a crate, that is breaking semver, +When you publish a new version of a crate that is breaking semver, you are causing a major inconvenience for the crate's users. -Their code might just stop compiling, when the offending version gets downloaded. +Their code might just stop compiling when the offending version gets downloaded. This also could happen if the crate containing violation is not an immediate dependency, -so one semver break, could result in tons of broken crates. +so one semver break could result in tons of broken crates. Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be really frustrating, and might drive the users to stop using your crate. +Because of that, maintainers have to yank +the incorrect releases as soon as possible +-- otherwise more users would encounter this problem and their trust +in this crate (and crates using it as a dependency) +would decrease. 
+ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} Some of popular Rust crates with millions of downloads happened to break semver: \begin{itemize} - \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285} - \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}; - \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}; + \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285}, + \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}, + \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}, \item and many more. We have developed a script that scans all releases - for semver breaks we can detect, the results are covered in section \ref{r:section_scanning_script} + for semver breaks we can detect. The results are covered in section \ref{r:section_scanning_script} \end{itemize} -Of course, the problem is even more prominent in less popular crates, -where developers might not be as experienced. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} +Those were examples of popular crates with experienced maintainers, but the problem is even more prominent in less popular crates +where developers might not know the common semver pitfalls. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} claims that out of the yanked (un-publised) releases, -semver break was the leading reason for yanking, with shocking 43\% rate. +semver break was the leading reason for yanking, with a shocking 43\% rate. 
It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), -are yanked, which should show the scale of the problem - thousands of detected semver breaks. +are yanked, which shows the scale of the problem - thousands of detected semver breaks. \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} There aren't many great existing tools for semver checking. -The main reason for that, is that the semantics of popular languages +The main reason for that is that the semantics of popular languages do not allow for complete automatic verification. Of course, there are some initiatives to combat this - for example, the Elm languge\footnote{https://elm-lang.org/} enforces semantic versioning. @@ -404,8 +411,8 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. -Adding new queries is designed to be quite accessible, and the maintaince comes to -keeping adapter up to date with rustdoc API changes, which seems to be about as low effort as it could be. +Adding new queries is designed to be quite accessible, and the maintenance comes to +keeping up to date with rustdoc API changes, which seems to be about as low effort as it could be. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % From b516c695f65efe5c2eb29953a9d7812a5af59509 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Wed, 8 Mar 2023 15:08:03 +0100 Subject: [PATCH 09/35] Review adjustments continued MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 77 +++++++++++++++++++++++++-------------------------- 1 file changed, 38 insertions(+), 39 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 2f3ffcd1..207ca8a2 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -333,12 +333,12 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} \end{verbatim} \vspace{-5pt} -Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} -causes semver break, even though it is a non-public field of a non-public struct. +Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} +causes semver break, even though it is a non-public field of a non-public struct. That's because {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits -that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} -implement {\ttfamily Send} and {\ttfamily Sync}. -In contrary, {\ttfamily Rc} implements neither of them, +that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} +implement {\ttfamily Send} and {\ttfamily Sync}. +In contrary, {\ttfamily Rc} implements neither of them, so the change results in publicly visible struct {\ttfamily Bar} losing a trait. The given example is not only non-obvious, but also is even harder to notice @@ -352,20 +352,19 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} \section{Consequences of breaking semver} When you publish a new version of a crate that is breaking semver, -you are causing a major inconvenience for the crate's users. 
+you are causing a major inconvenience for the crate's users. Their code might just stop compiling when the offending version gets downloaded. -This also could happen if the crate containing violation is not an immediate dependency, +This also could happen if the crate containing violation is not an immediate dependency, so one semver break could result in tons of broken crates. -Debugging a cryptic compilation error that starts showing up one day, -without any change to the code, can be really frustrating, -and might drive the users to stop using your crate. +Debugging a cryptic compilation error that starts showing up one day, +without any change to the code, can be really frustrating. Actually, we have experienced it during our contributions, as one of the dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. Because of that, maintainers have to yank the incorrect releases as soon as possible -- otherwise more users would encounter this problem and their trust in this crate (and crates using it as a dependency) -would decrease. +would decrease. Even though yanking the release seems easy, fixing the semver break could also result in a lot of additional work for the maintainers -- they have to investigate the semver break when it is reported, inform the users about the yanking and possibly help some move away from the faulty release. 
\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} @@ -374,45 +373,45 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285}, \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}, \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}, - \item and many more. We have developed a script that scans all releases + \item and many more. We have developed a script that scans all releases for semver breaks we can detect. The results are covered in section \ref{r:section_scanning_script} \end{itemize} Those were examples of popular crates with experienced maintainers, but the problem is even more prominent in less popular crates where developers might not know the common semver pitfalls. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} -claims that out of the yanked (un-publised) releases, +claims that out of the yanked (un-publised) releases, semver break was the leading reason for yanking, with a shocking 43\% rate. -It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), -are yanked, which shows the scale of the problem - thousands of detected semver breaks. +It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), +are yanked, which shows the scale of the problem -- thousands of detected semver breaks. \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -There aren't many great existing tools for semver checking. +There aren't many great existing tools for semver checking. The main reason for that is that the semantics of popular languages -do not allow for complete automatic verification. 
-Of course, there are some initiatives to combat this - for example, -the Elm languge\footnote{https://elm-lang.org/} enforces semantic versioning. -It's type system enables automatic detection of all API changes. -Outside of that, it does not appear that tools for checking semver +do not allow for complete automatic verification. +There are some initiatives to combat this. For example, +the Elm languge\footnote{https://elm-lang.org/} by design enforces semantic versioning. +Its type system enables automatic detection of all API changes. +Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. -Unfortunately, the Rust langugage's semantic were not designed with semver in mind. -Despite this, there are some existing tools for semver checking. -First of them, cargo-breaking, works on the abstract syntax tree. -The problem here is that to compare API changes, you must navigate two trees at once, -which can get really complex and tedious, because the abstract syntax tree could change quite a lot, -even without any public API changes. -Another issue is that both language syntax and the structure of the abstract syntax tree -might change along with the development of the language, which makes maintenance time-consuming. - -The second existing tool is rust-semverver, which focuses on -the metadata present in the rust-specific rlib binary dynamic static library format. -Because of that, unfortunately, the user experience is far from ideal, -as it forces the user into some specific unstable versions of the language, and the quality of error messages is limited. - -In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. +Unfortunately, the Rust langugage's semantic were also not designed with semver in mind. +Despite this, there are some existing tools for semver checking. 
+First of them, \texttt{cargo-breaking}, works on the abstract syntax tree. +The problem here is that to compare API changes, you must navigate two trees at once, +which can get really complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree could change quite a lot, +even without any public API changes. +Another issue is that both language syntax and the structure of the abstract syntax tree +often change along with the development of the language, which makes maintenance time-consuming. + +The second existing tool is \texttt{rust-semverver}, which focuses on +the metadata present in the rust-specific rlib binary static library format. +Because of that, unfortunately, the user experience is far from ideal, +as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. + +In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. Adding new queries is designed to be quite accessible, and the maintenance comes to -keeping up to date with rustdoc API changes, which seems to be about as low effort as it could be. +keeping up with rustdoc API changes, which seems to be about as low effort as it could be. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Vision % @@ -541,7 +540,7 @@ \section{Project baseline} \item some existing lints had false-positives, \item the codebase was not in a state where new contributors could easily begin making changes to the project (which is crucial for the project to flourish in the long term). - For example, adding new lints and tests wasn't intuitive and required many manual steps, + For example, adding new lints and tests wasn't intuitive and required many manual steps, the filenames and variable names were not always descriptive enough and the code lacked comments that explained some of the logic and decisions behind it. 
\end{itemize} @@ -749,7 +748,7 @@ \section{Responsibilities} https://semver.org/ % State of the art references: -\bibitem[1]{beaman} Predrag Gruevski, +\bibitem[1]{beaman} Predrag Gruevski, \textit{Towards fearless cargo update} (2022) \\ https://predr.ag/blog/toward-fearless-cargo-update/ From 0c48d29c86ab07af3d0d5846cfb97373f6f821f0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Thu, 9 Mar 2023 00:37:54 +0100 Subject: [PATCH 10/35] Minor changes, moving to cite from footnote MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 207ca8a2..803fe463 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -339,7 +339,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, -so the change results in publicly visible struct {\ttfamily Bar} losing a trait. +so the change results in a publicly visible struct {\ttfamily Bar} losing a trait. The given example is not only non-obvious, but also is even harder to notice in large codebases, where those struct could be in very different locations. @@ -354,11 +354,11 @@ \section{Consequences of breaking semver} When you publish a new version of a crate that is breaking semver, you are causing a major inconvenience for the crate's users. Their code might just stop compiling when the offending version gets downloaded. -This also could happen if the crate containing violation is not an immediate dependency, +This also could happen if the crate containing the violation is not an immediate dependency, so one semver break could result in tons of broken crates. 
Debugging a cryptic compilation error that starts showing up one day, -without any change to the code, can be really frustrating. Actually, we have experienced it during our contributions, as one of the dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. +without any change to the code, can be frustrating. In fact, we have experienced it during our contributions, as one of our dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. Because of that, maintainers have to yank the incorrect releases as soon as possible @@ -370,15 +370,15 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ Some of popular Rust crates with millions of downloads happened to break semver: \begin{itemize} - \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285}, + \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature \footnote{https://github.com/PyO3/pyo3/issues/285}, \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}, \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}, \item and many more. We have developed a script that scans all releases for semver breaks we can detect. The results are covered in section \ref{r:section_scanning_script} \end{itemize} -Those were examples of popular crates with experienced maintainers, but the problem is even more prominent in less popular crates -where developers might not know the common semver pitfalls. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf} +Those were examples of popular crates with experienced maintainers, but the problem is even more prominent in less used crates +where developers might not know the common semver pitfalls. 
A paper \cite{paper} claims that out of the yanked (un-publised) releases, semver break was the leading reason for yanking, with a shocking 43\% rate. It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), @@ -390,7 +390,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se The main reason for that is that the semantics of popular languages do not allow for complete automatic verification. There are some initiatives to combat this. For example, -the Elm languge\footnote{https://elm-lang.org/} by design enforces semantic versioning. +the Elm languge\cite{elm-lang} by design enforces semantic versioning. Its type system enables automatic detection of all API changes. Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. @@ -398,19 +398,20 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se Unfortunately, the Rust langugage's semantic were also not designed with semver in mind. Despite this, there are some existing tools for semver checking. First of them, \texttt{cargo-breaking}, works on the abstract syntax tree. -The problem here is that to compare API changes, you must navigate two trees at once, -which can get really complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree could change quite a lot, +Although ASTs contain all the information needed for comparing API changes, +it has a major drawback -- you must navigate two trees at once. +It can get complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree -often change along with the development of the language, which makes maintenance time-consuming. 
+often changes along with the development of the language, which makes maintenance time-consuming. The second existing tool is \texttt{rust-semverver}, which focuses on the metadata present in the rust-specific rlib binary static library format. Because of that, unfortunately, the user experience is far from ideal, as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. -In comparsion, the cargo-semver-checks' approach to write lints as queries, seems to work really well. -Adding new queries is designed to be quite accessible, and the maintenance comes to +In comparison, the cargo-semver-checks' approach to write lints as queries, seems to work really well. +Adding new queries is designed to be accessible, and the maintenance comes to keeping up with rustdoc API changes, which seems to be about as low effort as it could be. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -747,13 +748,20 @@ \section{Responsibilities} \textit{Semantic Versioning 2.0.0} (2022) \\ https://semver.org/ -% State of the art references: -\bibitem[1]{beaman} Predrag Gruevski, +\bibitem{paper} Hao Li, Filpe R Cogo, Cor-Paul Bezemer, \\ + \textit{An Empirical Study of Yanked Releases in the Rust Package Registry} + (2022) \\ https://arxiv.org/pdf/2201.11821.pdf + +\bibitem{fearless-cargo-update} Predrag Gruevski, \textit{Towards fearless cargo update} (2022) \\ https://predr.ag/blog/toward-fearless-cargo-update/ -\end{thebibliography} +\bibitem{elm-lang} Evan Czaplicki, + \textit{Elm Programming Language} (2021) \\ + https://elm-lang.org/ + +\end{thebibliography} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Attachments % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From eb04c98e914618760b9684cf9e15e6962dc453fb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Thu, 9 Mar 2023 00:48:36 +0100 Subject: [PATCH 11/35] Remove 'you' MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 
Signed-off-by: Michał Staniewski --- thesis-en.tex | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 803fe463..f8a3781c 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -351,8 +351,8 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} \section{Consequences of breaking semver} -When you publish a new version of a crate that is breaking semver, -you are causing a major inconvenience for the crate's users. +When a maintainer publishes a new version of a crate that is breaking semver, +it is causing a major inconvenience for the crate's users. Their code might just stop compiling when the offending version gets downloaded. This also could happen if the crate containing the violation is not an immediate dependency, so one semver break could result in tons of broken crates. From e1812e585ed83e251f1c5290bebc2696b39cb0d3 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak Date: Thu, 9 Mar 2023 08:36:52 +0100 Subject: [PATCH 12/35] Resolved conversations --- thesis-en.tex | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index f8a3781c..a9489bd9 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -346,8 +346,13 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} In fact, a similar error crept into release v3.2.0 of a well-known crate maintained by the Rust team -- {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. -% TODO: add another example -It should be clear by now, that breaking semver on accident is possible. 
+ +The same issue almost happened +(but has been prevented thanks to our tool) +in another common library \texttt{rust-libp2p}, +in which from the conversation \cite{issue-libp2p} it's clear that the maintainers +were not expecting their type to stop being \texttt{UnwindSafe} and were likely not even aware that +their type was publicly \texttt{UnwindSafe} to start with. \section{Consequences of breaking semver} @@ -358,7 +363,8 @@ \section{Consequences of breaking semver} so one semver break could result in tons of broken crates. Debugging a cryptic compilation error that starts showing up one day, -without any change to the code, can be frustrating. In fact, we have experienced it during our contributions, as one of our dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. +without any change to the code, can be frustrating. In fact, we have experienced it during our contributions +(one of the tool's users opened a GitHub Issue \cite{issue-compiling-fails}), as one of our dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. Because of that, maintainers have to yank the incorrect releases as soon as possible @@ -388,7 +394,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se There aren't many great existing tools for semver checking. The main reason for that is that the semantics of popular languages -do not allow for complete automatic verification. +makes it that complete automatic verification is practically impossible. There are some initiatives to combat this. For example, the Elm languge\cite{elm-lang} by design enforces semantic versioning. Its type system enables automatic detection of all API changes. 
@@ -407,7 +413,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se The second existing tool is \texttt{rust-semverver}, which focuses on the metadata present in the rust-specific rlib binary static library format. -Because of that, unfortunately, the user experience is far from ideal, +Because of that, the user experience is far from ideal, as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. In comparison, the cargo-semver-checks' approach to write lints as queries, seems to work really well. @@ -715,6 +721,8 @@ \section{Responsibilities} % function}, Mathematica Absurdica, 117 (1965) 338--9. \bibitem{issue-merge-cargo} \href{}{GitHub cargo-semver-checks issue \#61: Prepare for merging into cargo} \bibitem{issue-cli-interface} \href{}{GitHub cargo-semver-checks issue \#86 What should the CLI look like?} +\bibitem{issue-compiling-fails} \href{}{GitHub cargo-semver-checks issue \#317: compiling semver-checks fails} +\bibitem{issue-libp2p} \href{}{GitHub rust-libp2p issue \#3312: feat: migrate to quick-protobuf} \bibitem{Rust-1} Rust Team, \textit{Rust Programming Language} (2023) \\ From 8b54ef68a3fbe9ccaed60eb19207977185525685 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:07:01 +0100 Subject: [PATCH 13/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index a9489bd9..9d238c49 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -405,7 +405,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se Despite this, there are some existing tools for semver checking. First of them, \texttt{cargo-breaking}, works on the abstract syntax tree. 
Although ASTs contain all the information needed for comparing API changes, -it has a major drawback -- you must navigate two trees at once. +it has a major drawback -- two trees must be navigated at once. It can get complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree From 1e3a46d9b9609aefad49c848484887c3fec16fb5 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:08:12 +0100 Subject: [PATCH 14/35] Update thesis-en.tex --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 9d238c49..3426091f 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -364,7 +364,7 @@ \section{Consequences of breaking semver} Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be frustrating. In fact, we have experienced it during our contributions -(one of the tool's users opened a GitHub Issue \cite{issue-compiling-fails}), as one of our dependencies broke semver. This is a major problem, as it might drive the users to stop using your crate. +(one of the tool's users opened a GitHub Issue \cite{issue-compiling-fails}), as one of our dependencies broke semver. This is a major problem, as it might drive the users to stop using such crate. 
Because of that, maintainers have to yank the incorrect releases as soon as possible From 1a2e5f6d3bd3b96c207e60e417f0a4e97035443d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Staniewski?= Date: Thu, 9 Mar 2023 09:25:54 +0100 Subject: [PATCH 15/35] Remove all remaining uses of \footnote{} MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Staniewski --- thesis-en.tex | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/thesis-en.tex b/thesis-en.tex index 3426091f..fc995ce8 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -376,9 +376,9 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ Some of popular Rust crates with millions of downloads happened to break semver: \begin{itemize} - \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature \footnote{https://github.com/PyO3/pyo3/issues/285}, - \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876}, - \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22}, + \item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature \cite{pyo3-issue} + \item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait \cite{clap-issue} + \item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract \cite{block-buffer-issue} \item and many more. We have developed a script that scans all releases for semver breaks we can detect. 
The results are covered in section \ref{r:section_scanning_script} \end{itemize} @@ -768,6 +768,17 @@ \section{Responsibilities} \textit{Elm Programming Language} (2021) \\ https://elm-lang.org/ +\bibitem{pyo3-issue} + \textit{Github PyO3 issue \#285} (2018) \\ + https://github.com/PyO3/pyo3/issues/285 + +\bibitem{clap-issue} + \textit{Github clap issue \#3876} (2022) \\ + https://github.com/clap-rs/clap/issues/3876 + +\bibitem{block-buffer-issue} + \textit{Github RustCrypto issue \#22} \\ + https://github.com/RustCrypto/utils/issues/22 \end{thebibliography} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From bfa83d6937ab2006e2cc2361ff566c2970e1bb18 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:31:23 +0100 Subject: [PATCH 16/35] Update thesis-en.tex --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index fc995ce8..d17c917e 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -343,7 +343,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} The given example is not only non-obvious, but also is even harder to notice in large codebases, where those struct could be in very different locations. -In fact, a similar error crept into release v3.2.0 of a well-known crate +In fact, a similar error crept into the release v3.2.0 of a well-known crate maintained by the Rust team -- {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. 
From 87f7d095e5d45bafb25b84d3fbc0ce0c07a3bc51 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:31:32 +0100 Subject: [PATCH 17/35] Update thesis-en.tex --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index d17c917e..9c501753 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -416,7 +416,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se Because of that, the user experience is far from ideal, as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. -In comparison, the cargo-semver-checks' approach to write lints as queries, seems to work really well. +In comparison, the cargo-semver-checks' approach to write lints as queries seems to work really well. Adding new queries is designed to be accessible, and the maintenance comes to keeping up with rustdoc API changes, which seems to be about as low effort as it could be. From 10659e606c7d343215f9d8b26c53ac504e47e19c Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:31:43 +0100 Subject: [PATCH 18/35] Update thesis-en.tex --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 9c501753..743e146d 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -387,7 +387,7 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ where developers might not know the common semver pitfalls. A paper \cite{paper} claims that out of the yanked (un-publised) releases, semver break was the leading reason for yanking, with a shocking 43\% rate. 
-It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already), +It also mentions that 3.7\% of all releases (and there is more than 300 000 of them already) are yanked, which shows the scale of the problem -- thousands of detected semver breaks. \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} From 66bcc1ca2c7d770f6fb08c8b9048812cdce19bb2 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:32:02 +0100 Subject: [PATCH 19/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 743e146d..8e23b100 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -320,7 +320,7 @@ \chapter{State of the art}\label{r:chapter_stateoftheart} \section{Problems with using semver in Rust}\label{r:section_usageofsemver} It might seem easy to maintain semver, but some violations are hard to notice -when not actively searching for them. Let's look at an example. +when not actively searched for. Consider the following example: \vspace{-3pt} \begin{verbatim} struct Foo { From 0dd3b5c8e7bd28a6e71d90ba0e8633dcd915da7d Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:32:12 +0100 Subject: [PATCH 20/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 8e23b100..d24333a7 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -335,7 +335,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} Changing {\ttfamily Foo.x} type from {\ttfamily String} to {\ttfamily Rc} causes semver break, even though it is a non-public field of a non-public struct. 
-That's because {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits +That is because {\ttfamily String} implements {\ttfamily Send} and {\ttfamily Sync} traits that are automatically derived, making both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}. In contrary, {\ttfamily Rc} implements neither of them, From 61b7df59b45fff8ad9ee3c5ba63d6d1ce7eb1451 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:32:24 +0100 Subject: [PATCH 21/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index d24333a7..33c849c5 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -341,7 +341,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} In contrary, {\ttfamily Rc} implements neither of them, so the change results in a publicly visible struct {\ttfamily Bar} losing a trait. -The given example is not only non-obvious, but also is even harder to notice +The given example is not only unobvious, but also even harder to notice in large codebases, where those struct could be in very different locations. In fact, a similar error crept into the release v3.2.0 of a well-known crate maintained by the Rust team -- {\ttfamily clap}. 
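The auto-trait argument in the hunks above can be checked with a small, self-contained Rust sketch. It is an illustration only and not code from the thesis or from cargo-semver-checks. The module names `v1` and `v2`, the helpers `Probe` and `Fallback`, and the two `bar_*_is_send_sync` functions are hypothetical. The concrete type `Rc<str>` is also an assumption, since the text only says `Rc`. The probe relies on inherent associated constants taking priority over trait constants, a trick that only works when the probed type is written out concretely.

```rust
use std::marker::PhantomData;

/// "Before" release: the private struct `Foo` stores a `String`.
#[allow(dead_code)]
mod v1 {
    struct Foo {
        x: String,
    }

    pub struct Bar {
        y: Foo,
    }
}

/// "After" release: only the private field type changed (assumed `Rc<str>`).
#[allow(dead_code)]
mod v2 {
    use std::rc::Rc;

    struct Foo {
        x: Rc<str>,
    }

    pub struct Bar {
        y: Foo,
    }
}

/// Fallback constant, available for every type through the blanket impl.
trait Fallback {
    const IS_SEND_SYNC: bool = false;
}
impl<T: ?Sized> Fallback for T {}

/// Hypothetical helper: its inherent constant exists only when `T: Send + Sync`
/// and then shadows the fallback constant of the same name.
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);

#[allow(dead_code)]
impl<T: ?Sized + Send + Sync> Probe<T> {
    const IS_SEND_SYNC: bool = true;
}

/// Reports whether the public `Bar` of the "before" release is `Send + Sync`.
pub fn bar_v1_is_send_sync() -> bool {
    <Probe<v1::Bar>>::IS_SEND_SYNC
}

/// Reports whether the public `Bar` of the "after" release is `Send + Sync`.
pub fn bar_v2_is_send_sync() -> bool {
    <Probe<v2::Bar>>::IS_SEND_SYNC
}
```

Calling the two functions from a `main` function or a unit test shows the difference: the first returns `true` and the second returns `false`, although neither the public signature of `Bar` nor any public item changed between the two modules.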
From bea746756ddf6ae3d42b78a3b2abf31f79a71a66 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:32:33 +0100 Subject: [PATCH 22/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 33c849c5..1a593ac2 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -342,7 +342,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} so the change results in a publicly visible struct {\ttfamily Bar} losing a trait. The given example is not only unobvious, but also even harder to notice -in large codebases, where those struct could be in very different locations. +in large codebases, where those structs could be in completely different locations. In fact, a similar error crept into the release v3.2.0 of a well-known crate maintained by the Rust team -- {\ttfamily clap}. More of that later on in section \ref{r:section_real_life_semver_breaks}. From 5e8e4c7472bc73889f71f516bfd2afbe83723936 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:32:45 +0100 Subject: [PATCH 23/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 1a593ac2..5a5c0035 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -345,7 +345,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} in large codebases, where those structs could be in completely different locations. In fact, a similar error crept into the release v3.2.0 of a well-known crate maintained by the Rust team -- {\ttfamily clap}. -More of that later on in section \ref{r:section_real_life_semver_breaks}. 
+More details about it can be found in section \ref{r:section_real_life_semver_breaks}. The same issue almost happened (but has been prevented thanks to our tool) From d717c3198ab98b68f114810dede00b439bb43c68 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:36:41 +0100 Subject: [PATCH 24/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 5a5c0035..8fed24ad 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -350,7 +350,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} The same issue almost happened (but has been prevented thanks to our tool) in another common library \texttt{rust-libp2p}, -in which from the conversation \cite{issue-libp2p} it's clear that the maintainers +where it is clear from the conversation \cite{issue-libp2p} that the maintainers were not expecting their type to stop being \texttt{UnwindSafe} and were likely not even aware that their type was publicly \texttt{UnwindSafe} to start with. From 1e169ac197e9257e16a462fedcc477d00f074910 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:36:55 +0100 Subject: [PATCH 25/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 8fed24ad..f3c84f04 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -417,7 +417,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. In comparison, the cargo-semver-checks' approach to write lints as queries seems to work really well. 
-Adding new queries is designed to be accessible, and the maintenance comes to +Adding new queries is designed to be accessible, and the maintenance comes down to keeping up with rustdoc API changes, which seems to be about as low effort as it could be. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From f4150f778c1e07f5625133f370a1f0152ef5a147 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:37:11 +0100 Subject: [PATCH 26/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index f3c84f04..0d473782 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -356,7 +356,7 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver} \section{Consequences of breaking semver} -When a maintainer publishes a new version of a crate that is breaking semver, +When a maintainer publishes a new version of their crate that is breaking semver, it is causing a major inconvenience for the crate's users. Their code might just stop compiling when the offending version gets downloaded. This also could happen if the crate containing the violation is not an immediate dependency, From 24bb968578a51d055e84e34cb85bfbdc76886a99 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:38:01 +0100 Subject: [PATCH 27/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 0d473782..06f2bdad 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -359,7 +359,7 @@ \section{Consequences of breaking semver} When a maintainer publishes a new version of their crate that is breaking semver, it is causing a major inconvenience for the crate's users. 
Their code might just stop compiling when the offending version gets downloaded. -This also could happen if the crate containing the violation is not an immediate dependency, +This could also happen if the crate containing the violation is not an immediate dependency, so one semver break could result in tons of broken crates. Debugging a cryptic compilation error that starts showing up one day, From a26bb9b145bffcadd7714eac91b983101dd747b4 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:38:15 +0100 Subject: [PATCH 28/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 06f2bdad..0bbdf9cb 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -360,7 +360,7 @@ \section{Consequences of breaking semver} it is causing a major inconvenience for the crate's users. Their code might just stop compiling when the offending version gets downloaded. This could also happen if the crate containing the violation is not an immediate dependency, -so one semver break could result in tons of broken crates. +so one semver break could result in tons of other broken crates. Debugging a cryptic compilation error that starts showing up one day, without any change to the code, can be frustrating. 
In fact, we have experienced it during our contributions From e0beeaed6e0ffbcec01ca31ac40919de01e0f189 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:38:38 +0100 Subject: [PATCH 29/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 0bbdf9cb..f1a9a6ea 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -369,7 +369,7 @@ \section{Consequences of breaking semver} Because of that, maintainers have to yank the incorrect releases as soon as possible -- otherwise more users would encounter this problem and their trust -in this crate (and crates using it as a dependency) +in this particular crate (and crates using it as a dependency) would decrease. Even though yanking the release seems easy, fixing the semver break could also result in a lot of additional work for the maintainers -- they have to investigate the semver break when it is reported, inform the users about the yanking and possibly help some move away from the faulty release. 
\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks} From c90a7b4c4220db642de1e49b50aaf9f1c5418da2 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:39:35 +0100 Subject: [PATCH 30/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index f1a9a6ea..325168f6 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -392,7 +392,7 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_ \section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools} -There aren't many great existing tools for semver checking. +There are not many great tools for semver checking in existence. The main reason for that is that the semantics of popular languages makes it that complete automatic verification is practically impossible. There are some initiatives to combat this. For example, From e7a31328f498cf2e868f7ad41b82d25eb581c0ca Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:41:13 +0100 Subject: [PATCH 31/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 325168f6..e9ca0087 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -401,7 +401,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se Outside of that, it does not appear that tools for checking semver in estabilished languages like Python or C++ are commonly used in the industry. -Unfortunately, the Rust langugage's semantic were also not designed with semver in mind. +Unfortunately, the Rust language's semantics were also not designed with semver in mind. 
Despite this, there are some existing tools for semver checking. First of them, \texttt{cargo-breaking}, works on the abstract syntax tree. Although ASTs contain all the information needed for comparing API changes, From 7a4c9cec8c4bf7f7ad12c4703777b46de605abe3 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:43:28 +0100 Subject: [PATCH 32/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index e9ca0087..fc615a04 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -394,7 +394,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se There are not many great tools for semver checking in existence. The main reason for that is that the semantics of popular languages -makes it that complete automatic verification is practically impossible. +make complete and automatic verification practically impossible. There are some initiatives to combat this. For example, the Elm languge\cite{elm-lang} by design enforces semantic versioning. Its type system enables automatic detection of all API changes. 
From a82e4931c4631cc1f593246ad860b6bb473c9081 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:44:27 +0100 Subject: [PATCH 33/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index fc615a04..38020524 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -409,7 +409,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se It can get complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree could change quite a lot, even without any public API changes. Another issue is that both language syntax and the structure of the abstract syntax tree -often changes along with the development of the language, which makes maintenance time-consuming. +often change along with the development of the language, which makes maintenance time-consuming. The second existing tool is \texttt{rust-semverver}, which focuses on the metadata present in the rust-specific rlib binary static library format. From b11e07d460ba06ea71bc50b7ec8a064ae58a32c3 Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:45:12 +0100 Subject: [PATCH 34/35] Update thesis-en.tex Co-authored-by: Bartosz Smolarczyk <92160712+SmolSir@users.noreply.github.com> --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 38020524..42b4ed99 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -547,7 +547,7 @@ \section{Project baseline} \item some existing lints had false-positives, \item the codebase was not in a state where new contributors could easily begin making changes to the project (which is crucial for the project to flourish in the long term). 
- For example, adding new lints and tests wasn't intuitive and required many manual steps, + For example, adding new lints and tests was not intuitive and required many manual steps, the filenames and variable names were not always descriptive enough and the code lacked comments that explained some of the logic and decisions behind it. \end{itemize} From 3f37094c3516a946fe8e236a406810b1eb3c039c Mon Sep 17 00:00:00 2001 From: Tomasz Nowak <36604952+tonowak@users.noreply.github.com> Date: Thu, 9 Mar 2023 09:47:07 +0100 Subject: [PATCH 35/35] Update thesis-en.tex --- thesis-en.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/thesis-en.tex b/thesis-en.tex index 42b4ed99..d190d2d9 100644 --- a/thesis-en.tex +++ b/thesis-en.tex @@ -417,7 +417,7 @@ \section{Existing tools for detecting semver breaks}\label{r:section_existing_se as it forces the user to use some specific unstable versions of the language, and the quality of error messages is limited. In comparison, the cargo-semver-checks' approach to write lints as queries seems to work really well. -Adding new queries is designed to be accessible, and the maintenance comes down to +Adding new queries is designed to be accessible and the maintenance comes down to keeping up with rustdoc API changes, which seems to be about as low effort as it could be. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%