Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify build systems #233

Open
zhaih opened this issue Oct 10, 2023 · 14 comments
Open

Unify build systems #233

zhaih opened this issue Oct 10, 2023 · 14 comments

Comments

@zhaih
Copy link
Collaborator

zhaih commented Oct 10, 2023

Right now in normal luceneutil run (using local_run.py) we're using "Mike build" where each file is hardcoded and linked with your local lucene checkout.
And when we run knn_perf_test.py we have the local variable to specify the lucene checkout and we use ant build to build the vector files as well as KnnGraphTester.java
Plus we have to specify lucene checkout in build.xml (for ant build), and in gradle.properties (for code hint in Intellij), there's just too many different build system and too many places to specify your lucene checkout.

Maybe it's time to unify to one build system and centralize a place for configs?

@msokolov
Copy link
Collaborator

+1 to have fewer build systems. I'm not familiar with the gradle.properties though - is there a gradle build in luceneutil also?

@zhaih
Copy link
Collaborator Author

zhaih commented Oct 12, 2023 via email

@mikemccand
Copy link
Owner

I love the idea of a unified build system, but in my experience it remains an unachievable ideal -- I wind up hating nearly every build system I use ;) It's sort of like modern day "security" -- a necessarily evil that barely stops bad actors and slows down legitimate users (there are exceptions, e.g. Tesla cars' security is almost great).

Build systems become a barrier to progress making it hard to do simple things. Here is an example from Lucene's switch to Gradle, where merely detecting that you likely had a typo in the one test case you tried to run from the command line and so no tests ran and failing, required god-like mountain moving development to enable. Gradle in particular is also picky about specific JVM versions, which is not great since we use luceneutil to assess whether a brand new JVM impacts our performance. It's also slow in Lucene's case.

This proejct has some tricky dev requirements for a build tool. It's not just building the local java source in luceneutil, but also possibly one or two Lucene checkouts, possibly two luceneutil checkouts (when the versions of Lucene you are testing hav API differences) and possibly with different JVM versions or command-line options (when comparing performance across JVMs). If the build will also run each benchy, that is also gets tricky (one or two indices, will you run with facets or not, which tasks, which corpus, etc.). It should work well from IDEs and from the command-line.

When one wants to make improvements to these benchmarks, one must also become knowledgeable in whatever build system we pick, raising the (already high!) required bar for contributions.

These are the reasons why we still have a crappy "Mike build" here -- it's entirely Python (the language for expressing the test case already) so already approachable by Pythonistas. I would expect dev effort against luceneutil to be 95% benchmarking at at most 5% struggling with build tools.

Is there a nice Python based build system yet :)

All this being said, I would be happy to try yet another build system, as long as it does not become too much of a barrier to actual progress in benchmarking. It should be nearly invisible to devs, like a Tesla car's security.

</rant>

@nitirajrathore
Copy link
Contributor

Amazingly, I was also struglling with same issues in the build and was about to open an issue today for the same when I checked this issue already open last week. And so I have following proposal. I will submit a PR for the same.

  1. Use the gradle as a single build system. Reasons being
    1. Gradle is the build system for Apache lucene as well. So people contributing to Apache lucene anyway will have to learn that system sometime in their journey.
    2. Gradle has good documentation and becoming quite popular in dev community. I also checked out this youtube video for my own understanding.
    3. Gradle seems to also have python port. I am not sure though how it work yet.
  2. Lot of uncompilable files which were either using external library not added to the project or were dependent on older version of lucene.
    1. Proposed Solution: Move all the files that are compilable with a particular version of lucene to a folder with that version name under src/lucene folder.
      1. We can add different commented source folders to the settings.gradle. So if someone really want to use those classes they can add sources of particular lucene version and update the settings.gradle accordingly to build.
    2. Current fix : Moved all such classes to a single folder called src/lucene/legacy
  3. Updated some classes to compile with lucene 9.8 release (lucene-9.8 branch in Apache lucene)
  4. Back dependency on lucene version. A lot of code in luceneutils depends on the lucene version to work correctly. If something changes in Apache lucene it will break the comparison code between candidate and baseline. For example if I try to compare the current lucene main branch against the lucene-9.8 branch it gives error in the StringFieldDocSelector class where in lucene-9.8 expects the class to have getFilteredLiveDocs() method while lucene main expects it to have getFilteredDocs().
    1. Current Fix : I have implemented both the methods in StringFieldDocSelector with same implementation. But this can be wrong. But I was able to run the python src/python/localrun.py -source wikimedium10k -b <baseline> -c <candidate> successfully.
    2. Proposed solution : May be the comparison code can be in another git project and the java code dependent on lucene can be in different git project. The comparison runner can compile the 2 different branches of the indexer/searcher to generate results and compare. For example, say we break current package of util to util and benchmarker in their own git repos. Now benchmarker will contain the python code like compititions.py which are basically the benchmark runner and util will have all the lucene dependent Indexer / Searcher etc code. If anything is incompatible in the baseline and candidate of Apache lucene project then developer can create separate baseline and candidate branches in util project in their local to create the indexes and run the queries as expected and then emit the results in a common place outside util project which can be then compared by the python code in benchmarker project.

@nitirajrathore
Copy link
Contributor

Another 2 interesting aspects of Gradle are

  1. Its build files are actually written down in a programming language, like we can choose Groovy or Kotlin which are both jvm based languages. Developers are not limited by the functionality of a particular data format like xml for expressing their needs. I think the trend is towards using Kotlin based builds now for Gradle. So yes there are changes that keep happening in the Gradle world, but may be they are for good.
  2. I was transforming the ant builds in this package to gradle format like below, which is great as the format is more based on programming code rather than expressing in data language (so more understandable. Also, the build scripts themselves are debuggable in IDE, just in case we need to.). But then I discovered that even before I make the complete move, Gradle out of the box support including ant targets as gradle tasks.

So it became a one liner change (ant.importBuild 'build.xml') to make it work till I figure out copy filter changes that are required. With this change I ran ./gradlew vectors100-docs and it worked perfectly fine.

Ant to Gradle

Ant

  <target name="vectors300-docs" depends="build">
    <java className="WikiVectors" classpathref="build.classpath">
      <arg value="${data.dir}/glove.6B.300d.txt"/>
      <arg value="${data.dir}/enwiki-20120502-lines-1k-fixed-utf8-with-random-label.txt"/>
      <arg value="${data.dir}/enwiki-20120502-lines-1k-300d.vec"/>
    </java>
  </target>

Gradle

task vectors300Docs(type:JavaExec) {
  main = "WikiVectors"
  description = "Generate 300 dimension 32 bit (float) document vectors for the wiki lines"
  classpath = sourceSets.main.runtimeClasspath
  args = [dataDir + "/glove.6B.300d.txt", dataDir + "/enwiki-20120502-lines-1k-fixed-utf8-with-random-label.txt",
          dataDir + "/enwiki-20120502-lines-1k-300d.vec"]
  dependsOn = [ build ]
}

@msokolov
Copy link
Collaborator

Personally I don't care all that much what the build system is. I do like to keep things as simple as possible, so I kind of like declarative build configuration rather than a build controlled by a programming language. Still missing Makefiles I guess. But mostly I would just like whatever system we have to be maintained and easy to use. Easy to update is a second priority, but also important. For this change I suggest testing it by running the nightlyBenchmarks.py - I think the move of classes to a legacy folder might break some of that?

@nitirajrathore
Copy link
Contributor

Build systems become a barrier to progress making it hard to do simple things
I would expect dev effort against luceneutil to be 95% benchmarking at at most 5% struggling with build tools.

I agree the that build system should not come in the way of development. The problem is the current system is definitely taking more time and effort for building so any automated system seems better than current.

But mostly I would just like whatever system we have to be maintained and easy to use.

I think the learning curve in Gradle is definitely higher than than any of the other build system and it seems to be evoling fast, which again means we should expect frequent additions and changes to it. The languages used namely kotlin and grovvy are both something that we don't use in day to day development, so that is another hinderance for someone new to make any custom change with.

In the absence of any definitive winner, as per my understanding, and in need of some good build system, I will give Apache Ant + Ivy a shot and get some build system to work in the project which works fine with IDEs as well. Ant surely does not come in between anything else, so may be having a python build system along side Ant is more feasible as compared to Maven and Gradle.

@nitirajrathore
Copy link
Contributor

@mikemccand @msokolov @zhaih :
I have removed gradle and maven builds from the luceneutil and used ant and ivy to manage the project. java classes gets compiled properly and and can be executed using ant task (as mentioned below).
For python, I don't think we need build system as such, but I am not a python expert, I think we can just run all python scripts directly with python command. ( after install dependencies manually via pip ). I think this is what we are doing right now as well. Let me know if I am wrong.
I have pushed my changes to #234. Although, after rebasing its showing a lot of merged changes etc.

I tested that following commands and they worked fine.

for java:

ant run -Dmain.class=KnnGraphTester -Dargs="-docs ../data/enwiki-20120502-lines-1k-100d.vec -niter 1000 -beamWidthIndex 100 -maxConn 16 -niter 1000 -search tasks/vector-task-100d.vec -reindex"

for python:

python src/python/localrun.py -source wikimedium1m -b ~/mywork/lucene/lucene -c ~/mywork/lucene/lucene

I was not able to find the nightlyBenchmarks.py in the source code but there was one nightlyBench.py. But running it gives multiple errors. May be either I am not running it correctly or running wrong script.

I tried running python src/python/nightlyBench.py but it gave me many errors.
Firstly it was trying to find NIGHTLY_LOG_DIR in constants.py, which does not exists. So I defined it as a path to some local log folder.
But then it is trying to find a lucene/benchmarks/reports.nightly/indexing.html which also does not exits.
Am I running this correctly?

Let me know if you like this build system and what else should I test.

@nitirajrathore
Copy link
Contributor

I have created a new PR with Ant + Ivy build. Please have a look. #247

I am still not sure how to test the nightly benchmarks. Please help me with the right command to verify this. Rest everything seems to be working fine.
Please review.

@zhaih
Copy link
Collaborator Author

zhaih commented Dec 4, 2023 via email

@zhaih
Copy link
Collaborator Author

zhaih commented Dec 5, 2023

To @mikemccand 's comment

Gradle in particular is also picky about specific JVM versions, which is not great since we use luceneutil to assess whether a brand new JVM impacts our performance.

I think when we build the source we always expect to use the Lucene's JDK version, when we run it, we can use different jvms right? For luceneutil I think gradle can work until a jar is built, then in python/cmd we can use whatever JVM and add whatever JVM options we want to test?

This proejct has some tricky dev requirements for a build tool. It's not just building the local java source in luceneutil, but also possibly one or two Lucene checkouts

I think that's why it's really important we only have 1 place to specify a lucene checkout instead of 3 (which is current situation)

When one wants to make improvements to these benchmarks, one must also become knowledgeable in whatever build system we pick, raising the (already high!) required bar for contributions.

I agree, and I think that's why we need to do this, we need to work harder on those build system such that it is not visible for the newcomers (just like Lucene, there's tons of gradle scripts we run daily but we never notice them). Ideally all one should ever do is to invoke setup.py and that's it. All IDE and other stuff should be linked automatically. I'm not a fan of gradle (because of its speed and sometimes flaky issues), but that's the only one I know that works well with most of nowaday's tools.

To summarize, I think we should have one and only one build system that does:

  1. Read from a config file to get information about lucene checkout
  2. link and compile all the related code
  3. create a jar output
  4. Integrate with IDE such that all code suggestion feature will work

As for how we run the jar, we can still let python do that, or figure out a better way in the future.

@nitirajrathore
Copy link
Contributor

Ideally all one should ever do is to invoke setup.py and that's it.

Don't think so. Does not hurt to run ant compile before any python run.

Ant can efficiently compile and do a lot of setup task. Python can be used to run the python runnables which will use the compiled class located in the designated folder in java's class path whereever required.
Going forward I am planning to submit quite some java main classes which can execute a lot of tasks independent of python. We should not impose python build on those java main classes.

More importantly I think we should conclude on one strategy and make progress. If things do not work very well we can try other options.

@msokolov
Copy link
Collaborator

msokolov commented Dec 12, 2023 via email

@nitirajrathore
Copy link
Contributor

I think there are couple of things when talk about build system.

  1. Java classes need to be compiled
  2. Java dependencies needs to be satisfied. Like any external jars etc needs to be downloaded.
  3. Java Main classes need to be invoked for different tasks
  4. Python libraries needs to be installed. (usually done via pip I guess)
  5. Python scripts to be invoked.

1, 2 and 3 are done by Ant very well.

(4) is not handled currently automatically. But should be relatively easy to use a requirements.txt file inside the python package and we can run python3 -m pip install -r requirements.txt as described here inside the ant compile task itself any task before it.

(5) I would say that invoking python scripts should still be done by python command directly and we should not create target for it. But we can create some targets for regular activities like benchmarking etc which will list in ant -projecthelp for someone to easily use them correctly.

I definitely do not like java files being compiled inside the python script using javac command. If that is what we are calling python build system, I am surely in favor of removing it. I did not remove it in this PR just to avoid any breaking changes.

Please feel free approve PR if that looks good or propose changes. I would be happy to incorporate any change in the PR asap.

Configuring IDE is marked optional. I included that section as the earlier gradle build was introduced to make the IDE compilation happy. So just wanted to make sure that people who uses IDE can still continue to use it with Ant as well.
It is not a very difficult task to setup IDE, especially if the steps are written.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants