Julia Subtyping

Currently, this project contains an empirical evaluation that supports a restriction on Julia types under which subtyping for the Julia language becomes decidable.

To that end, we analyze type annotations used in Julia programs to see whether they satisfy a wildcards-like restriction (there are also analyses of scoping, lower bounds, and combined lower+upper bounds).

Note. Some annotations that do not literally correspond to the restriction from JB's thesis proposal/paper on decidable subtyping are nevertheless not reported. In particular, types like Tuple{Ref{T}} where T are not reported because they are trivially equivalent to Tuple{Ref{T} where T}.

Analysis results for the May 2023 corpus used for the thesis/paper can be found here.

Static analysis of types

Type annotations

Usually, type annotations appear after :: in the code:

- as a part of the method signature (either as an argument or return type annotation),
- as a type assertion in the method body.

When methods are generic, there can be an extra where sequence in the method signature, outside of the argument list.

Examples:

foo(x :: Int) :: Bool
foo(x :: Vector{T} where T)
foo(x :: T, xs :: Vector{T}) where T
x :: Int

We collect method type signatures and all other type annotations to the right of ::, which includes type assertions and types of fields.
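The collection of types to the right of :: can be illustrated with a minimal sketch that uses only Base. The helper names collect_annotations! and collect_annotations are hypothetical and are not taken from the repository, which relies on MacroTools instead. The sketch only gathers the types after :: and does not record where clauses of method signatures.

```julia
# Hypothetical helper, not taken from the repository: walks an expression and
# records the right-hand side of every `::` node it meets.
function collect_annotations!(acc::Vector{Any}, ex)
    if ex isa Expr
        if ex.head === :(::)
            # `x :: T` has two arguments and `:: T` has one; the type is last
            push!(acc, ex.args[end])
        end
        # Recurse so that nested definitions and assertions are visited too
        for arg in ex.args
            collect_annotations!(acc, arg)
        end
    end
    return acc
end

collect_annotations(ex) = collect_annotations!(Any[], ex)

# Argument annotations in a generic method signature
sig = Meta.parse("foo(x :: T, xs :: Vector{T}) where T = x")
anns_sig = collect_annotations(sig)      # the expressions T and Vector{T}

# A type assertion inside a method body
body = Meta.parse("function bar(y) z = (y :: Int) + 1; z end")
anns_body = collect_annotations(body)    # the expression Int
```

Because the walker recurses into every sub-expression, annotations in nested function definitions are picked up by the same traversal.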

Note. The MacroTools.jl package has a handy function isdef for checking function definitions, but it seems to always return true, so we do not use it to process method signatures. Two other MacroTools functions are relevant:

- longdef turns all functions into long forms, including nested expressions,
- splitdef conveniently processes any function definition form (short, long, anonymous) except for the do-notation.

However, we also want to collect information from nested function definitions and stand-alone type assertions. This is done manually with @capture.

Type declarations

We collect all user-defined type declarations and record the declaration itself and its declared supertype. Then, we check whether they satisfy use-site variance when treated as complete types. For example, Foo{X, Y<:Ref{X}} is analyzed as Foo{X, Y} where Y<:Ref{X} where X. Decidable subtyping doesn't require type declarations to satisfy use-site variance, but it is a nice indicator of how complex the declarations are.
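As a rough illustration of the two steps above, the sketch below first shows what "treated as a complete type" means for the Foo example, and then splits a declaration into the declared type and its supertype. The types Foo, Bar, and Baz and the helper decl_and_super are hypothetical and exist only for this example; the repository's own extraction code may work differently.

```julia
# Hypothetical declaration used only for illustration.
struct Foo{X, Y<:Ref{X}} end

# The declaration treated as a complete type: each parameter becomes a
# `where`-bound variable, with the first parameter bound outermost.
FooAsType = Foo{X, Y} where Y<:Ref{X} where X

# Hypothetical helper: split a `struct` or `abstract type` declaration into
# the declared type and its declared supertype (Any when none is given).
function decl_and_super(ex::Expr)
    # `struct` stores a mutability flag first; `abstract` stores the signature first
    sig = ex.head === :struct ? ex.args[2] : ex.args[1]
    if sig isa Expr && sig.head === :<:
        return sig.args[1], sig.args[2]
    end
    return sig, :Any
end

decl = Meta.parse("struct Bar{T} <: AbstractVector{T} end")
name, super = decl_and_super(decl)   # the expressions Bar{T} and AbstractVector{T}
```

Here FooAsType denotes the same type as the bare name Foo, which is why the analysis can treat a declaration header as an ordinary where type.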

Repository Organization

```
README.md                  this file
init-script.jl             to install dependencies into the global Julia environment
types-extract.jl           script for extracting type annotations from source code of packages
types-analyze.jl           script for analyzing extracted type annotations
run-tests.jl               convenience script for running the tests ($ julia run-tests.jl or $ ./run-tests.jl)
src                        source code
- JuliaSub.jl              main module
- lb-analysis              analysis of lower bounds
  - lib.jl                 main file combining everything related to the analysis
  - data.jl                data types used for the analysis
  - process-code.jl        extraction and counting lower bounds in Julia expressions
  - process-text.jl        textual and parse-based analysis of lower bounds in text
  - process-pkgs.jl        lower-bounds analysis of files, packages, and folders with packages
- types-analysis           analysis of type annotations
  - lib.jl                 main file combining everything related to the analysis
  - data.jl                data types used for the analysis
  - types-extract.jl       extraction of type annotations
  - typedecls-extract.jl   extraction of type declarations
  - type-analyze.jl        analysis of type annotations
  - typedecl-analyze.jl    analysis of type declarations
  - pkg-process.jl         processing of packages: extraction of type annotations and declarations into a CSV, an analysis of type annotations and declarations read from a CSV and saving interesting results into another CSV
- utils                    auxiliary
  - lib.jl                 main file combining all utilities
  - equality.jl            generic definition of structural equality
  - file-system.jl         file system helpers
  - multiset.jl            multiset merging via adding frequencies (instead of default max)
  - parsing.jl             helpers for parsing Julia files
  - status-info.jl         custom logging
lb-analysis.jl             script that performs a complete analysis of lower bounds in the given folder with Julia packages
Project.toml               dependencies
```
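The difference between merging multisets by adding frequencies and merging by taking the maximum, mentioned for multiset.jl above, can be sketched with plain dictionaries. Modelling frequencies as Dict values is an assumption made for this example only; the repository uses the Multisets package, and the variable names are hypothetical.

```julia
# Frequencies of lower bounds found in two hypothetical files,
# modelled as plain dictionaries (an assumption for this sketch).
counts_a = Dict("Int" => 2, "T" => 1)
counts_b = Dict("Int" => 3, "Bool" => 1)

# Merging by adding frequencies: every occurrence in either file is counted
summed = mergewith(+, counts_a, counts_b)

# Merging by taking the maximum: the behaviour the analysis wants to avoid
maxed = mergewith(max, counts_a, counts_b)
```

With addition, "Int" ends up with a frequency of 5, whereas the maximum keeps only 3, which would undercount occurrences across files.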

Dependencies

Julia with the following packages:
- MacroTools for working with Julia AST
  Note. Another package that could have been useful is Match
- Multisets for counting frequencies of lower bounds
- DataStructures for linked lists, to efficiently collect annotations
- CSV.jl
- DataFrames.jl
- Distributed.jl
- ArgParse
JuliaPkgsList.jl
JuliaPkgDownloader.jl

Getting packages data

Assumes that JuliaPkgsList.jl and JuliaPkgDownloader.jl are available at ../utils/JuliaPkgsList.jl and ../utils/JuliaPkgDownloader.jl.

For both packages, run init-script.jl first.
JuliaPkgsList.jl should be "patched" with an empty data/excluded.txt file to make it easier to track which entries are invalid (since the file is outdated now anyway).

Note. Some packages occasionally fail to download because of network issues. If more than 50 packages fail when downloading all packages, run the download again. Several dozen packages will remain broken for other reasons.

$ ../utils/JuliaPkgsList.jl/gen-pkgs-list.jl 100 -p data/julia-pkgs-info.json --name --includeversion --includeuuid -o data/pkgs-list/top-pkgs-list.txt

$ ../utils/JuliaPkgsList.jl/gen-pkgs-list.jl 0 -p data/julia-pkgs-info.json --name --includeversion --includeuuid -o data/pkgs-list/all-pkgs-list.txt

$ julia -p 32 ../utils/JuliaPkgDownloader.jl/download-pkgs.jl -s data/pkgs-list/100-top-pkgs-list.txt -d data/100

$ julia -p 32 ../utils/JuliaPkgDownloader.jl/download-pkgs.jl -s data/pkgs-list/all-pkgs-list.txt -d data/all

Running type annotations analysis

Run init-script.jl first.

Note. Output and error streams are redirected to a file. To print to the terminal, remove > data...

To extract type annotations:

$ julia -p 32 types-extract.jl data/100 data/ta-info/100 > data/ta-info/log-extract-100.txt 2>&1

$ julia -p 32 types-extract.jl data/all data/ta-info/all > data/ta-info/log-extract-all.txt 2>&1

To analyze type annotations:

$ julia -p 32 types-analyze.jl data/ta-info/100 > data/ta-info/log-analysis-100.txt 2>&1

$ julia -p 32 types-analyze.jl data/ta-info/all > data/ta-info/log-analysis-all.txt 2>&1

Adding more analyses

To extend the output CSV of the analysis and have a new CSV with types of interest:

In src/types-analysis/pkg-process.jl:
- extend ANALYSIS_COLS_ANNS_NOERR
- in analyzePkgTypeAnns,
  - extend failedResult
  - extend dfta
  - add a df* var and extend the for-loop right after
  - extend the resulting Dict
- in getTypeAnnsAnalyses, extend map in varsAnalyses
- extend ANALYSIS_COLS_DECLS
- in analyzePkgTypeDecls,
  - extend failedResult
  - extend dftd
  - add a df* var and extend the for-loop right after
  - extend the resulting Dict
- in addTypeDeclsAnalysis!, extend newCols
- in analyzeTypeDecl, extend the resulting array and increment in fill
- in analyzePkgTypesAndSave2CSV,
  - extend both combineVCat!
  - add a CSV.write
In tests, make sure to add isfile checks in pkg-process.jl for the new CSV files. Furthermore, manually check that the necessary annotations/declarations are reported, as it is easy to make mistakes when copy-pasting dataframe-related code.
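A minimal sketch of such an isfile check is shown below. The helper absent_outputs and the file names existing.csv and new-analysis.csv are hypothetical placeholders, not the names used by the repository's tests.

```julia
# Hypothetical helper: names of expected output files that are absent from `dir`.
absent_outputs(dir, names) = [n for n in names if !isfile(joinpath(dir, n))]

# Simulate an output folder in which only one of two expected CSV files exists
dir = mktempdir()
write(joinpath(dir, "existing.csv"), "col\n1\n")

absent = absent_outputs(dir, ["existing.csv", "new-analysis.csv"])
```

An empty result means every expected CSV was written; a non-empty result lists the files a new analysis forgot to save.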

Old README from 2021
