Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustc: Librarify marker_rustc_driver #271

Merged
merged 7 commits into from
Oct 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions marker_rustc_driver/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,9 @@ license = { workspace = true }
repository = { workspace = true }
version = { workspace = true }

[lib]
doctest = false

[dependencies]
marker_adapter = { workspace = true }
marker_api = { workspace = true, features = ["driver-api"] }
Expand Down
388 changes: 388 additions & 0 deletions marker_rustc_driver/src/lib.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,388 @@
#![doc = include_str!("../README.md")]
#![feature(rustc_private)]
#![feature(let_chains)]
#![feature(lint_reasons)]
#![feature(iter_collect_into)]
#![feature(non_exhaustive_omitted_patterns_lint)]
#![feature(once_cell_try)]
#![warn(rustc::internal)]
#![warn(clippy::pedantic)]
#![warn(non_exhaustive_omitted_patterns)]
#![allow(clippy::missing_panics_doc)]
#![allow(clippy::module_name_repetitions)]
#![allow(clippy::wildcard_imports)]
#![allow(clippy::needless_lifetimes, reason = "lifetimes will be required to fix ICEs")]
#![allow(clippy::needless_collect, reason = "false positives for `alloc_slice`")]
#![allow(clippy::too_many_lines, reason = "long functions are unavoidable for matches")]
#![allow(rustdoc::private_intra_doc_links)]

extern crate rustc_ast;
extern crate rustc_data_structures;
extern crate rustc_driver;
extern crate rustc_errors;
extern crate rustc_hash;
extern crate rustc_hir;
extern crate rustc_hir_analysis;
extern crate rustc_interface;
extern crate rustc_lint;
extern crate rustc_lint_defs;
extern crate rustc_middle;
extern crate rustc_session;
extern crate rustc_span;
extern crate rustc_target;

pub mod context;
pub mod conversion;
pub mod lint_pass;

use std::env;
use std::ops::Deref;
use std::process::Command;

use camino::{Utf8Path, Utf8PathBuf};
use marker_adapter::{LintCrateInfo, LINT_CRATES_ENV};
use marker_error::Context;

use crate::conversion::rustc::RustcConverter;

// region replace rust toolchain dev
const RUSTC_TOOLCHAIN_VERSION: &str = "nightly-2023-10-05";
// endregion replace rust toolchain dev

struct DefaultCallbacks {
env_vars: Vec<(&'static str, String)>,
}
impl rustc_driver::Callbacks for DefaultCallbacks {
fn config(&mut self, config: &mut rustc_interface::Config) {
let env_vars = std::mem::take(&mut self.env_vars);
config.parse_sess_created = Some(Box::new(move |sess| {
register_tracked_env(sess, &env_vars);
}));
}
}

struct MarkerCallback {
env_vars: Vec<(&'static str, String)>,
lint_crates: Vec<LintCrateInfo>,
}

impl rustc_driver::Callbacks for MarkerCallback {
fn config(&mut self, config: &mut rustc_interface::Config) {
let env_vars = std::mem::take(&mut self.env_vars);
let lint_crates = self.lint_crates.clone();
config.parse_sess_created = Some(Box::new(move |sess| {
register_tracked_env(sess, &env_vars);
register_tracked_files(sess, &lint_crates);
}));

// Clippy explicitly calls any previous `register_lints` functions. This
// will not be done here to keep it simple and to ensure that only known
// code is executed.
assert!(config.register_lints.is_none());
let lint_crates = std::mem::take(&mut self.lint_crates);

config.register_lints = Some(Box::new(move |_sess, lint_store| {
// It looks like it can happen, that the `config` function is called
// with a different thread than the actual lint pass later, how interesting.
// This will not make sure that the adapter is always initiated.
if let Err(err) = lint_pass::RustcLintPass::init_adapter(&lint_crates) {
err.print();
// FIXME: we need to figure out the way to run the initialization code
// earlier where we can cleanly report an error and exit the process.
std::process::exit(1);
}
// Register lints from lint crates. This is required to have rustc track
// the lint level correctly.
let lints: Vec<_> = lint_pass::RustcLintPass::marker_lints()
.into_iter()
.map(RustcConverter::static_to_lint)
.collect();

lint_store.register_lints(&lints);

lint_store.register_late_pass(|_| Box::new(lint_pass::RustcLintPass));
}));
}
}

fn register_tracked_env(sess: &mut rustc_session::parse::ParseSess, vars: &[(&'static str, String)]) {
use rustc_span::Symbol;
let env = sess.env_depinfo.get_mut();

for (key, value) in vars {
env.insert((Symbol::intern(key), Some(Symbol::intern(value))));
}
}

fn register_tracked_files(sess: &mut rustc_session::parse::ParseSess, lint_crates: &[LintCrateInfo]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rustc uses caching where possible. This function, and the function above, register relevant files and environment values which should be checked, when rustc decides if a cache can be reused or not.

This one mainly registers the Cargo.toml file, as that's where Marker's lints are configured and the lint crate files. It basically just ensures that we don't use caches if any lint crate was updated. Do you maybe want to make this one public as well? I'm not sure how dylint handles this kind of thing.

This is just a suggestion, if you handle this differently or don't need it, I'm happy to merge this as is :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you maybe want to make this one public as well?

It's kind of you to ask, but I don't think it's necessary.

If "marker_rustc_driver as a library" feels "incomplete" to you without this function, I can sympathize with that feeling. But I would recommend to keep the interface as small as possible until me or someone else complains. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to mention it, since i[#294]t was a thing we ran into after some testing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious now (because caching has been the source of several bugs in Dylint). Is there a direct connection between caching and #294?

Copy link
Member

@xFrednet xFrednet Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference to #294 seems to be an accidental copy-pasta 🍝. Sorry, about that.

There is a direct connection between the function I marked here and caching. This is the PR that prevented unwanted caching for Marker: #211

Do you have some examples for caching related bugs in Dylint and how you solved them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use rustc_span::Symbol;

let files = sess.file_depinfo.get_mut();

// Cargo sets the current directory, to the folder containing the `Cargo.toml`
// file of the compiled crate. Therefore, we can use the relative path.
let cargo_file = Utf8Path::new("./Cargo.toml");
if cargo_file.exists() {
files.insert(Symbol::intern("./Cargo.toml"));
}

// At this point, the lint crate paths should be absolute. Therefore, we
// can just throw it in the `files` set.
for lint_crate in lint_crates {
files.insert(Symbol::intern(lint_crate.path.as_str()));
}

// Track the driver executable in debug builds
#[cfg(debug_assertions)]
match env::current_exe().as_ref().map(|path| path.to_str()) {
Ok(Some(current_exe)) => {
files.insert(Symbol::intern(current_exe));
},
Ok(None) => {
// The path is not valid UTF-8. We can simply ignore this case
},
Err(e) => eprintln!("getting the path of the current executable failed {e:#?}"),
}
}

/// If a command-line option matches `find_arg`, then apply the predicate `pred` on its value. If
/// `true`, then return it. The parameter is assumed to be either `--arg=value` or `--arg value`.
fn arg_value<'a, T: Deref<Target = str>>(
args: &'a [T],
find_arg: &str,
pred: impl Fn(&str) -> bool,
) -> Option<&'a str> {
let mut args = args.iter().map(Deref::deref);
while let Some(arg) = args.next() {
let mut arg = arg.splitn(2, '=');
if arg.next() != Some(find_arg) {
continue;
}

match arg.next().or_else(|| args.next()) {
Some(v) if pred(v) => return Some(v),
_ => {},
}
}
None
}

fn toolchain_path(home: Option<String>, toolchain: Option<String>) -> Option<Utf8PathBuf> {
home.and_then(|home| {
toolchain.map(|toolchain| {
let mut path = Utf8PathBuf::from(home);
path.push("toolchains");
path.push(toolchain);
path
})
})
}

fn display_help() {
println!(
"\
Marker's rustc driver to run your lint crates.

Usage:
marker_rustc_driver [options] [--] [<opts>...]

Common options:
-h, --help Print this message
--rustc Pass all arguments to rustc
-V, --version Print version information and exit
--toolchain Print the required toolchain and API version

---

This message belongs to a specific marker driver, if possible you should avoid
interfacing with the driver directly and use `cargo marker` instead.
"
);
}

#[allow(clippy::missing_errors_doc)]
pub fn try_main(args: impl Iterator<Item = String>) -> Result<(), MainError> {
// Note: This driver has two different kinds of "arguments".
// 1. Normal arguments, passed directly to the binary. (Collected below)
// 2. Arguments provided to `cargo-marker` which are forwarded to this process as environment
// values. These are usually driver-independent and handled by the adapter.
let mut orig_args: Vec<String> = args.collect();

let sys_root_arg = arg_value(&orig_args, "--sysroot", |_| true);
let has_sys_root_arg = sys_root_arg.is_some();

// Further invocation of rustc require the `--sysroot` flag. We add it here
// in preparation.
if !has_sys_root_arg {
let sys_root = find_sys_root(sys_root_arg);
orig_args.extend(["--sysroot".into(), sys_root]);
};

// make "marker_rustc_driver --rustc" work like a subcommand that passes
// all args to "rustc" for example `marker_rustc_driver --rustc --version`
// will print the rustc version that is used
if let Some(pos) = orig_args.iter().position(|arg| arg == "--rustc") {
orig_args.remove(pos);
orig_args[0] = "rustc".to_string();

rustc_driver::RunCompiler::new(&orig_args, &mut DefaultCallbacks { env_vars: vec![] }).run()?;

return Ok(());
}

if orig_args.iter().any(|a| a == "--version" || a == "-V") {
let version_info = rustc_tools_util::get_version_info!();
println!("{version_info}");
return Ok(());
}

if orig_args.iter().any(|a| a == "--toolchain") {
println!("toolchain: {RUSTC_TOOLCHAIN_VERSION}");
println!("driver: {}", env!("CARGO_PKG_VERSION"));
println!("marker-api: {}", marker_api::MARKER_API_VERSION);

return Ok(());
}

// Setting RUSTC_WRAPPER causes Cargo to pass 'rustc' as the first argument.
// We're invoking the compiler programmatically, so we'll ignore this.
let wrapper_mode = orig_args.get(1).map(Utf8Path::new).and_then(Utf8Path::file_stem) == Some("rustc");

if wrapper_mode {
// we still want to be able to invoke rustc normally
orig_args.remove(1);
}

// The `--rustc` argument has already been handled, therefore we can just
// check for this argument without caring about the position.
if !wrapper_mode && orig_args.iter().any(|a| a == "--help" || a == "-h") {
display_help();
return Ok(());
}

// We enable Marker if one of the following conditions is met
// - IF Marker is run on the main crate, not on deps (`!cap_lints_allow`) THEN
// - IF `--no-deps` is not set (`!no_deps`) OR
// - IF `--no-deps` is set and Marker is run on the specified primary package
let cap_lints_allow = arg_value(&orig_args, "--cap-lints", |val| val == "allow").is_some()
&& arg_value(&orig_args, "--force-warn", |_| true).is_none();
let no_deps = orig_args.iter().any(|arg| arg == "--no-deps");
let in_primary_package = env::var("CARGO_PRIMARY_PACKAGE").is_ok();

let enable_marker = !cap_lints_allow && (!no_deps || in_primary_package);
let env_vars = vec![(LINT_CRATES_ENV, std::env::var(LINT_CRATES_ENV).unwrap_or_default())];
if !enable_marker {
rustc_driver::RunCompiler::new(&orig_args, &mut DefaultCallbacks { env_vars }).run()?;
return Ok(());
}

let lint_crates = LintCrateInfo::list_from_env()
.context(|| "Error while determining the lint crates to load")?
.unwrap_or_default();

let additional_args = [
// Make it possible to use `#[allow(marker::{lint_name})]` without
// having to add `#![feature(register_tool)]` and `#![register_tool(marker)]`.
"-Zcrate-attr=feature(register_tool)",
"-Zcrate-attr=register_tool(marker)",
// We need to provide a marker cfg flag to allow conditional compilation,
// we add a simple `marker` config for the common use case, but also provide
// `marker=crate_name` for more complex uses
"--cfg=marker",
]
.into_iter()
.map(str::to_string)
.chain(
lint_crates
.iter()
.map(|krate| format!("--cfg=marker=\"{}\"", krate.name)),
);

orig_args.extend(additional_args);

let mut callback = MarkerCallback { env_vars, lint_crates };
rustc_driver::RunCompiler::new(&orig_args, &mut callback).run()?;

Ok(())
}

/// Get the sysroot, looking from most specific to this invocation to the least:
/// - command line
/// - runtime environment
/// - `SYSROOT`
/// - `RUSTUP_HOME`, `MULTIRUST_HOME`, `RUSTUP_TOOLCHAIN`, `MULTIRUST_TOOLCHAIN`
/// - sysroot from rustc in the path
/// - compile-time environment
/// - `SYSROOT`
/// - `RUSTUP_HOME`, `MULTIRUST_HOME`, `RUSTUP_TOOLCHAIN`, `MULTIRUST_TOOLCHAIN`
fn find_sys_root(sys_root_arg: Option<&str>) -> String {
sys_root_arg
.map(Utf8PathBuf::from)
.or_else(|| std::env::var("SYSROOT").ok().map(Utf8PathBuf::from))
.or_else(|| {
let home = std::env::var("RUSTUP_HOME")
.or_else(|_| std::env::var("MULTIRUST_HOME"))
.ok();
let toolchain = std::env::var("RUSTUP_TOOLCHAIN")
.or_else(|_| std::env::var("MULTIRUST_TOOLCHAIN"))
.ok();
toolchain_path(home, toolchain)
})
.or_else(|| {
Command::new("rustc")
.arg("--print")
.arg("sysroot")
.output()
.ok()
.and_then(|out| String::from_utf8(out.stdout).ok())
.map(|s| Utf8PathBuf::from(s.trim()))
})
.or_else(|| option_env!("SYSROOT").map(Utf8PathBuf::from))
.or_else(|| {
let home = option_env!("RUSTUP_HOME")
.or(option_env!("MULTIRUST_HOME"))
.map(ToString::to_string);
let toolchain = option_env!("RUSTUP_TOOLCHAIN")
.or(option_env!("MULTIRUST_TOOLCHAIN"))
.map(ToString::to_string);
toolchain_path(home, toolchain)
})
.map(Utf8PathBuf::into_string)
.expect("need to specify SYSROOT env var during marker compilation, or use rustup or multirust")
}

pub enum MainError {
Custom(marker_error::Error),
Rustc(rustc_span::ErrorGuaranteed),
}

impl From<marker_error::Error> for MainError {
fn from(err: marker_error::Error) -> Self {
Self::Custom(err)
}
}

impl From<rustc_span::ErrorGuaranteed> for MainError {
fn from(err: rustc_span::ErrorGuaranteed) -> Self {
Self::Rustc(err)
}
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_arg_value() {
let args = &["--bar=bar", "--foobar", "123", "--foo"];

assert_eq!(arg_value(&[] as &[&str], "--foobar", |_| true), None);
assert_eq!(arg_value(args, "--bar", |_| false), None);
assert_eq!(arg_value(args, "--bar", |_| true), Some("bar"));
assert_eq!(arg_value(args, "--bar", |p| p == "bar"), Some("bar"));
assert_eq!(arg_value(args, "--bar", |p| p == "foo"), None);
assert_eq!(arg_value(args, "--foobar", |p| p == "foo"), None);
assert_eq!(arg_value(args, "--foobar", |p| p == "123"), Some("123"));
assert_eq!(arg_value(args, "--foo", |_| true), None);
}
}
Loading