-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Octave language with its own language distinct from MATLAB #6936
base: main
Are you sure you want to change the base?
Conversation
Octave is a "largely Matlab-compatible" language, but it has several syntax elements distinct from Matlab, and which are in common use by Octave coders and encouraged as conventional Octave style. This grammar supports Octave with those variants, and without new Matlab syntax elements which are not present in Octave.
Add language: Octave, as distinct from Matlab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline comments.
Additionally, several of your samples are too big for our needs (the ones suppressed in the diff). Please replace them with smaller samples.
You also need to add the cached license file the add-grammar
script should have created when you ran it.
If you can't find it or it didn't get create it, you can do it manually by running:
Line 134 in 39fd5e9
bundle exec licensed cache -c vendor/licenses/config.yml |
Spotted this PR and just wanted to add that, while octave and matlab do indeed share an odd relationship, matlab is not the only language with which github language detection clashes with, particularly when ".m" files are involved, which are also used by objective-c, wolfram mathematica, and mercury-lang as well. So I would say this is less a case of "making octave distinct from matlab" specifically per se, but more about adding octave as a proper language, and disambiguating octave files among all the above languages that also use the ".m" extension, including matlab (whose syntax is indeed different, despite the many similarities). E.g. if one searches github for "language:Objective-C octave" (or indeed "language:Objective-C matlab") one will witness this confusion, and that it's not necessarily octave/matlab-exclusive. |
@lildude I have made all necessary changes to the best of my understanding of the errors produced from the automated tests. Please, run once again the testing worklfow and give me some guidance about the necessary changes that might still be required. Thank you very much for your help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking better. Just a few more suggestions.
@lildude Are there any other changes required before this PR can be merged? |
Yup. See the test failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the distinction between MATLAB and Octave that great that Octave projects don't contribute to the usage statistics of MATLAB, and vice versa? If not, then the new Octave
entry should be grouped under MATLAB
(using a group: MATLAB
field).
Otherwise, this will make a mess of classification: I can already see projects being classified as a "mixture" of MATLAB and Octave files, purely because each file was analysed in isolation, as per Linguist's implementation.
Grouping won't stop you from selecting a more accurate grammar to highlight Octave source (and by extension, CodeMirror modes). In fact, this is a large part of why this language "grouping" feature exists.
Description
This PR is adding a distinction between MATLAB and GNU Octave languages, which both use
.m
files for their code files. Although GNU Octave is often described as the open source MATLAB alternative due to their similarities, the reality is that it is a separate programming language. Most importantly, although MATLAB syntax can be interpreted by GNU Octave, the opposite is not true. Octave syntax includes#
for comment lines, andendif
,endfor
,endfunction
,endclassdef
etc as block's closing statements, which are explicitly recognized by the Octave interpreter, but cannot be parsed/run in MATLAB. As a result of these differences, GitHub falsely recognizes Octave code as MATLAB and by using MATLAB's syntax highlighting grammar results into mis-displaying code, which is written explicitly for the Octave language.Besides the mis-highlighted code syntax of files containing Octave code, this is also confusing for users who might consider that code written for Octave may work in MATLAB, since GitHub reports all Octave code as MATLAB. GNU Octave is a community driven open source project, which has been in constant development for more than 30 years now, and with a reasonably large user base across academic and industrial sectors, especially in engineering fields. Thus, there is also an ethical aspect towards our community of developers and users that the Octave language should not be misclassified as another proprietary language by the worlds largest developer platform, GitHub, which gratefully provides free hosting for a myriad of open source projects.
Checklist:
NOT is:fork path:*.m AND (/^\s*endfor/ OR /^\s*endfunction/ OR /^\s*endclassdef/ OR /^\s*endif/)
returns 44.5k files in more than 200 repositories.
The following files have been included as real-world usage samples
#2299C4