Make module system more robust #139
Comments
@jeaye, I am on this.
I think there might be a bit more to OTOH I seem to remember being surprised that side effects happened in a transaction.
There's not much else happening in the transactions that mutate So it can very well be just an atom. If
The While (and if) this is true, clearly this is an unsupported scenario since loading is not thread-safe and hence would never be retried. But in the spirit of Clojure parity, it's worth pondering. I don't think the claim in this issue is correct that "Clojure will still mark a namespace as loaded if you throw an exception while requiring it". But I also think Clojure parity can be achieved by changing Maybe the scenario that was observed was one where an
It sounds like you're suggesting calling
(defonce ^:dynamic *loaded-libs* (atom (sorted-set)))

(defn- load-one
  [lib need-ns require]
  (load (root-resource lib))
  (throw-if (and need-ns (not (find-ns lib)))
            "namespace '%s' not found after loading '%s'"
            lib (root-resource lib))
  (when require
    (swap! *loaded-libs* conj lib)))

(defn- load-all
  "Loads a lib given its name and forces a load of any libs it directly or
  indirectly loads. If need-ns, ensures that the associated namespace
  exists after loading. If require, records the load so any duplicate loads
  can be skipped."
  [lib need-ns require]
  (swap! *loaded-libs* #(reduce1 conj %1 %2)
         (binding [*loaded-libs* (atom (sorted-set))]
           (load-one lib need-ns require)
           @*loaded-libs*)))

(defmacro ns
  (let [...]
    `(do
       ...
       (if (.equals '~name 'clojure.core)
         nil
         (do (swap! @#'*loaded-libs* conj '~name) nil)))))
That's right.
I don't quite follow this. Can you elaborate? The following experiment leads to the claim
;; a.clj
;; (ns a (:require b))
;; b.clj
;; (ns b)
;; (throw (Exception.))
➜ test clj -Sdeps '{:paths ["src"] :deps {com.clojure-goes-fast/clj-java-decompiler {:mvn/version "0.3.6"}}}'
Clojure 1.12.0
user=> (require 'a)
Execution error at b/eval152 (b.clj:3).
null
user=> (in-ns 'clojure.core)
#object[clojure.lang.Namespace 0x982bb90 "clojure.core"]
clojure.core=> @*loaded-libs*
#{b clojure.core.protocols clojure.core.server clojure.core.specs.alpha clojure.edn clojure.instant clojure.java.basis clojure.java.basis.impl clojure.java.browse clojure.java.io clojure.java.javadoc clojure.java.process clojure.java.shell clojure.main clojure.pprint clojure.repl clojure.repl.deps clojure.spec.alpha clojure.spec.gen.alpha clojure.string clojure.tools.deps.interop clojure.uuid clojure.walk}
clojure.core=>
The idea is that while The example implementation you mentioned, is exactly how it's currently implemented in Jank.
My impression is that it's undefined behavior to load files in parallel (per
Thanks. I realized it was
$ clj -Sdeps '{:paths ["src"]}'
Clojure 1.12.0
user=> (require 'a :reload-all)
Execution error at b/eval152 (b.clj:2).
null
user=> @@#'clojure.core/*loaded-libs*
#{clojure.core.protocols clojure.core.server clojure.edn clojure.instant clojure.java.basis clojure.java.basis.impl clojure.java.browse clojure.java.io clojure.java.javadoc clojure.java.process clojure.java.shell clojure.main clojure.pprint clojure.repl clojure.repl.deps clojure.spec.alpha clojure.spec.gen.alpha clojure.string clojure.tools.deps.interop clojure.uuid clojure.walk}
I think that's achieved by dynamic binding so jank would probably also inherit these semantics.
Whoops, I didn't see any commits referencing this issue.
More details to be filled in soon.
JIT
Add dynamic var for loaded libs
Clojure uses a clojure.core/*loaded-libs* dynamic var, which contains a set of loaded libs. It's initialized in clojure.core, like so:
We can do the same with jank but make it an atom and leave a TODO for ref usage. Then follow all of the usages of *loaded-libs* in clojure.core and make sure jank is doing the same work in the various functions.
Add cyclical dep check
Clojure also uses another dynamic var, *pending-paths*, to track cyclical deps. It's defined like so:
We'll want to add the same thing, along with the check-cyclic-dependency function and its usages. Manually test this to ensure it's working well.
Add transactionality
Based on my testing, Clojure will still mark a namespace as loaded if you throw an exception while requiring it. This means we don't need anything clever other than thread safety when updating the loaded libs set. Please do your own testing (check loaded libs, try to load a ns which isn't loaded, have it fail, and then check again) and ensure jank matches Clojure's behavior. The loaded libs var is private, but you can get to it like so:
Again, an atom will do fine here, as far as I can tell. I think it'd be good to ask about why a ref is used here, compared to an atom, in the Clojurians Slack; we might learn something neat.
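As a rough illustration of the thread-safety requirement, here is a minimal C++ sketch of a mutex-guarded loaded libs set. The names loaded_libs and require_lib are hypothetical and are not jank's actual API. The sketch assumes a lib is recorded only after its loader returns, which may or may not match the Clojure behavior this issue asks to be tested; the reload flag is likewise an assumption about how an "ignore the early exit" switch could look.

```cpp
#include <functional>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for the loaded libs state; not jank's real type.
class loaded_libs
{
public:
  // Returns true if the lib has been recorded as loaded.
  bool contains(std::string const &lib) const
  {
    std::lock_guard<std::mutex> const lock{ mutex_ };
    return libs_.count(lib) != 0;
  }

  // Records the lib. Returns true if it was not recorded before.
  bool insert(std::string const &lib)
  {
    std::lock_guard<std::mutex> const lock{ mutex_ };
    return libs_.insert(lib).second;
  }

private:
  mutable std::mutex mutex_;
  std::set<std::string> libs_;
};

// Runs the loader unless the lib is already recorded and reload is false.
// Assumption: the lib is recorded only after the loader returns, so a
// throwing loader leaves the set unchanged. The lock is not held while
// loading, so two threads may still load the same lib concurrently.
// Returns true if the loader was invoked.
inline bool require_lib(loaded_libs &libs,
                        std::string const &lib,
                        bool const reload,
                        std::function<void(std::string const &)> const &loader)
{
  if(!reload && libs.contains(lib))
  {
    return false;
  }
  loader(lib);
  libs.insert(lib);
  return true;
}
```

Only updates to the set are serialized here; the load itself is not, which matches the view in this thread that loading in parallel is not a supported scenario.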
Add ability to look up module by source
I have recently added an origin enum to load_module. It allows us to explicitly load from source or load from the latest, where latest is either binary or source depending on timestamps (timestamp checking not yet implemented).
For future functionality, it would be helpful to extract some of the behavior we have in load_module into a find_module function which also takes an origin. It can then return some data containing both the entry and which part of the entry should be used (based on the origin).
Add reloading support
When we pass :reload along to require, we want to load the module from the latest origin again, even if it's already loaded. This will require passing in a flag to ignore the early exit for skipping modules which are already loaded. This also needs to work with :reload-all, which reloads the specified module and all of its dependencies from their latest origins. Look into how Clojure does this in the load-lib function.
AOT
Ensure module dependencies are compiled and loaded properly
We can AOT compile a module to an object file right now. If that module requires other modules, we need to fork off and compile each of them into their own object files. This has not been tested and may not be working. Start simply, with a.jank and b.jank, where a requires b. Then tell jank to compile a with jank compile a (ensure that a.jank is on the module path).
Firstly, we want to be sure that both a and b get generated separately, with the correct LLVM IR modules. You can comment out the print in runtime::context::write_module to see the IR that's getting written for each module.
Secondly, we want to be sure that when we load a, b gets loaded as well. You can verify this by putting a println at the top of both of them, starting a jank repl, and requiring a. You should see both prints.
Load binaries only if the source isn't newer
When loading a module and the origin is latest, we can consider binaries for loading. If a binary is present, we need to check its timestamp against the source file. If the source file is missing, we need to not load the binary, since we always require source distributions. If the binary is newer or at least as new as the source, we can load the binary. If the source is newer, we need to load the source. For timestamps, we want to check the last modified time.
Skip module compilation based on timestamp
When compiling, if a module has a binary which has a sufficient timestamp, we can skip compilation of the source.
Update binary cache path based on compilation flags
Right now, our binary cache path is based on a few different values. This is in binary_version, in dir.cpp. We'll need to parameterize binary_version to take in more inputs. Here's a list of those I can think of right now:
This will mean that changing any of the above will result in a new binary cache dir which will then result in a recompile of every source (unless that cache dir has up-to-date binaries in it already).
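One way to picture a parameterized cache key is to hash every input into a single directory name. The sketch below is an assumption-heavy illustration: binary_version_sketch is a hypothetical name, the inputs are whatever strings the caller passes (the actual list of inputs is not reproduced here), and FNV-1a is swapped in simply because it is small and gives the same result across runs, not because jank uses it.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a over the bytes of data, continuing from an existing hash.
inline std::uint64_t fnv1a(std::string const &data,
                           std::uint64_t hash = 14695981039346656037ULL)
{
  for(unsigned char const c : data)
  {
    hash ^= c;
    hash *= 1099511628211ULL;
  }
  return hash;
}

// Hypothetical: folds all inputs (version strings, flags, and so on)
// into one stable key suitable for a cache directory name.
inline std::string binary_version_sketch(std::vector<std::string> const &inputs)
{
  std::uint64_t hash{ 14695981039346656037ULL };
  for(auto const &input : inputs)
  {
    hash = fnv1a(input, hash);
    // Hash a NUL separator so {"ab", "c"} and {"a", "bc"} differ.
    hash = fnv1a(std::string(1, '\0'), hash);
  }
  return std::to_string(hash);
}
```

Changing any single input produces a different key, so a new cache dir is used and stale binaries are never picked up.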
Prevent duplicate symbols from being generated
For this, we need a few things:
foo_456_0 becomes clojure_core_foo_456_0
runtime::munge
module_to_load_function for how to replace . with _
runtime::context::unique_{string,symbol} to use the current ns
jank_set_module_symbol_counter(ns_name, count) to the C API
module_to_load_function in create_function and add a call in there
Add .cpp module AOT compilation
If a module is backed by a .cpp file, we still want to be able to AOT compile it to a .o file. For example, if we have src/foo_native.cpp, we should be able to jank compile foo-native. In order to do this, we'll probably need to invoke Clang, make sure it has the right flags (includes, defines, etc), and tell it to compile that .cpp file to the target .o file, while handling any failures.
HOWEVER, before digging into this, I would ask in LLVM's #jit to see if there's a way to just get the IR from C++ that we're JIT compiling. That would allow us to not compile it a different way, since we can just use our normal JIT stuff and then extract the IR module and save it to an object file.
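If the Clang route is taken, assembling the command line might look roughly like the sketch below. The names native_compile_options and clang_compile_args are hypothetical, the clang++ executable name is an assumption, and only the standard -c, -o, -I and -D flags are used. Running the command and handling failures are left out.

```cpp
#include <string>
#include <vector>

// Hypothetical bundle of the flags a native module might need.
struct native_compile_options
{
  std::vector<std::string> include_dirs;
  std::vector<std::string> defines;
};

// Builds the argument list for compiling one .cpp file to one .o file.
inline std::vector<std::string> clang_compile_args(std::string const &source,
                                                   std::string const &object,
                                                   native_compile_options const &opts)
{
  // Assumption: the compiler is reachable as "clang++".
  std::vector<std::string> args{ "clang++", "-c", source, "-o", object };
  for(auto const &dir : opts.include_dirs)
  {
    args.push_back("-I" + dir);
  }
  for(auto const &def : opts.defines)
  {
    args.push_back("-D" + def);
  }
  return args;
}
```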