Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
[package]
name = "syn-inline-mod"
version = "0.4.0"
authors = ["Ted Driggs <tdriggs@outlook.com>"]
authors = ["Ted Driggs <tdriggs@outlook.com>", "Rain <rain1@fb.com>"]
edition = "2018"
repository = "https://github.com/TedDriggs/syn-inline-mod"
documentation = "https://docs.rs/syn-inline-mod/0.4.0"
Expand All @@ -11,7 +11,9 @@ description = "Inlines modules in Rust source code for source analysis"

[dependencies]
syn = { version = "^1.0.0", features = ["full", "visit-mut"] }
os_str_bytes = "2.3.1"
proc-macro2 = { version = "^1.0.0", features = ["span-locations"] }

[dev-dependencies]
syn = { version = "^1.0.0", features = ["extra-traits"] }
quote = "^1.0.0"
3 changes: 3 additions & 0 deletions fixtures/example/foo.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
#![inner_attr]

mod bar;
1 change: 1 addition & 0 deletions fixtures/example/foo/bar.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
fn bar_fn() {}
4 changes: 4 additions & 0 deletions fixtures/example/lib.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
//! An example of a Rust crate that can be inlined.

#[outer_attr]
mod foo;
137 changes: 131 additions & 6 deletions src/lib.rs
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
//! Utility to traverse the file-system and inline modules that are declared as references to
//! other Rust files.

use os_str_bytes::OsStringBytes;
use proc_macro2::Span;
use std::{
error, fmt, io,
path::{Path, PathBuf},
};
use syn::spanned::Spanned;
use syn::ItemMod;
use syn::{Attribute, ItemMod, LitByteStr};

mod mod_path;
mod resolver;
Expand Down Expand Up @@ -38,6 +39,57 @@ pub fn parse_and_inline_modules(src_file: &Path) -> syn::File {
.output
}

/// The string `syn_inline_mod_path`. This acts as a marker for inlined paths.
pub(crate) static SYN_INLINE_MOD_PATH: &str = "syn_inline_mod_path";

/// A module path in a list of attributes, found by `find_mod_path`.
#[derive(Clone, Debug)]
pub struct InlineModPath<'ast> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be called InlineModAttributes and have path: Option<PathBuf> as a field? Then the find_mod_path function would be changed to split_mod_attributes(attrs: &[Attribute]) -> InlineModAttributes {}.

/// The path this module resides in. Uses the same base (relative or absolute) as the
/// original path passed in.
pub path: PathBuf,

/// The outer attributes. These are of the form `#[attribute]` and are in the path that
/// imported this module.
pub outer_attributes: &'ast [Attribute],

/// The inner attributes. These are of the form `#![attribute]` and are defined in `path`.
///
/// The attributes listed in `inner_attributes` will retain their preceding `!` -- that is
/// technically not valid Rust but is done this way for simplicity.
pub inner_attributes: &'ast [Attribute],
}

/// Given a list of attributes from an inlined module, search for the first attribute of the form
/// `#[syn_inline_mod_path(b"...")]`. If it is found, then return a triple with:
/// * the module path
/// * the outer attributes
/// * the inner attributes
///
/// See the documentation for `InlinerBuilder::annotate_paths` for more information.
pub fn find_mod_path(attrs: &[Attribute]) -> Option<InlineModPath> {
let mut path = None;
let position = attrs.iter().position(|attr| {
if attr.path.is_ident(SYN_INLINE_MOD_PATH) {
// Ignore the attribute if it isn't of the correct form.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your worry here that some other proc-macro might happen to squat on our attribute name? Finding an attribute of the right name and wrong format feels to me like something must have gone horribly wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah or the user puts the annotation in themselves or something. The options are to panic or ignore -- what do you think about them? I guess panicking is probably better/safer since it'd be totally unexpected to hit this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I realized this may end up dealing with external data so we'd probably rather not panic. I think this should be a pretty rare case though, as you pointed out something's really gone wrong (but we don't want to create an opportunity for an accidental DoS).

Copy link
Owner

@TedDriggs TedDriggs Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DoS seems bad, but if you're dealing with untrusted input couldn't someone put that attribute into the original source code to try and make you follow a path into a directory you shouldn't?

I'm wondering if embedding these in-band is fundamentally unsafe.

I think this can be done safely, but we should explicitly disallow inlining if the input contains the magic attribute in the inner or outer attributes. That way we avoid attacks that attempt to feed us incorrect paths; those sorts of attack could be used to try and read arbitrary files from wherever this code was being run.

If we need to support the same code being inlined multiple times, then I see two options:

  1. We only fail inlining if the attribute is present and we've been asked to add path annotations. One upside of this approach is that we don't do work scanning attributes unless we're anticipating using the inserted annotations, while a downside of this approach is that someone could still sneak a bad path through because there's no connection between find_mod_path and InliningResult
  2. Rather than being a constant, the attribute path is a property of InlininerBuilder and later of InliningResult. That seems pretty ugly, however, especially since that would mean find_mod_path should take (r: &InliningResult, attributes: &[Attribute]) to avoid callers from hardcoding what they thought the attribute name was in their own code.

if let Ok(lit) = attr.parse_args::<LitByteStr>() {
let value = lit.value();
if let Ok(extracted_path) = PathBuf::from_vec(value) {
// Record the extracted path in `path` above.
path = Some(extracted_path);
}
return true;
}
}
false
})?;
Some(InlineModPath {
path: path.expect("position() returning Some means a path was found"),
outer_attributes: &attrs[0..position],
inner_attributes: &attrs[position + 1..],
})
}

/// A builder that can configure how to inline modules.
///
/// After creating a builder, set configuration options using the methods
Expand All @@ -46,11 +98,15 @@ pub fn parse_and_inline_modules(src_file: &Path) -> syn::File {
#[derive(Debug)]
pub struct InlinerBuilder {
root: bool,
annotate_paths: bool,
}

impl Default for InlinerBuilder {
fn default() -> Self {
InlinerBuilder { root: true }
InlinerBuilder {
root: true,
annotate_paths: false,
}
}
}

Expand All @@ -71,6 +127,51 @@ impl InlinerBuilder {
self
}

/// Configures whether modules being inlined should be annotated with the paths they
/// belong to.
///
/// If this is true, every module that is inlined will have an annotation
/// `#[syn_inline_mod_path("...")]` added to it. This annotation can then be used to
/// figure out which path a particular module lives in.
///
/// This is useful when mapping `Span` instances to the modules they come in.
///
/// Default: `false`.
///
/// ## Example
///
/// If `foo_crate/src/lib.rs` contains:
///
/// ```ignore
/// #[outer_attr]
/// pub mod foo_mod;
/// ```
///
/// and `foo_crate/src/foo_mod.rs` contains:
///
/// ```ignore
/// #![inner_attr]
///
/// fn foo() {}
/// ```
///
/// then setting this option to `true` will result in a module that looks something like:
///
/// ```ignore
/// #[outer_attr]
/// #[syn_inline_mod_path(b"foo_crate/src/foo_mod.rs")]
/// #![inner_attr]
/// pub mod foo_mod {
/// fn foo() {}
/// }
/// ```
///
/// Note that paths are encoded as byte strings to allow non-Unicode paths to be represented.
pub fn annotate_paths(&mut self, annotate: bool) -> &mut Self {
self.annotate_paths = annotate;
self
}

/// Parse the source code in `src_file` and return an `InliningResult` that has all modules
/// recursively inlined.
pub fn parse_and_inline_modules(&self, src_file: &Path) -> Result<InliningResult, Error> {
Expand All @@ -97,8 +198,19 @@ impl InlinerBuilder {
// but until we're sure that there's no performance impact of enabling it
// we'll let downstream code think that error tracking is optional.
let mut errors = Some(vec![]);
let result = Visitor::<R>::new(src_file, self.root, errors.as_mut(), resolver).visit()?;
Ok(InliningResult::new(result, errors.unwrap_or_default()))
let result = Visitor::<R>::new(
src_file,
self.root,
self.annotate_paths,
errors.as_mut(),
resolver,
)
.visit()?;
Ok(InliningResult::new(
result,
errors.unwrap_or_default(),
self.annotate_paths,
))
}
}

Expand Down Expand Up @@ -152,13 +264,18 @@ impl fmt::Display for Error {
pub struct InliningResult {
output: syn::File,
errors: Vec<InlineError>,
paths_annotated: bool,
}

impl InliningResult {
/// Create a new `InliningResult` with the best-effort output and any errors encountered
/// during the inlining process.
pub(crate) fn new(output: syn::File, errors: Vec<InlineError>) -> Self {
InliningResult { output, errors }
pub(crate) fn new(output: syn::File, errors: Vec<InlineError>, paths_annotated: bool) -> Self {
InliningResult {
output,
errors,
paths_annotated,
}
}

/// The best-effort result of inlining.
Expand All @@ -177,6 +294,14 @@ impl InliningResult {
!self.errors.is_empty()
}

/// Whether paths were annotated using `InlinerBuilder::annotate_paths`.
///
/// If this is `true`, `find_mod_path` may be used to figure out the path a module is defined
/// in.
pub fn paths_annotated(&self) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about this being called has_path_annotations? In my opinion that makes it clearer this is describing the current state of something.

self.paths_annotated
}

/// Break an incomplete inlining into the best-effort parsed result and the errors encountered.
///
/// # Usage
Expand Down
21 changes: 17 additions & 4 deletions src/visitor.rs
Original file line number Diff line number Diff line change
@@ -1,15 +1,19 @@
use std::path::Path;

use os_str_bytes::OsStrBytes;
use proc_macro2::{Ident, Span};
use syn::visit_mut::VisitMut;
use syn::ItemMod;
use syn::{parse_quote, ItemMod, LitByteStr};

use crate::{Error, FileResolver, InlineError, ModContext};
use crate::{Error, FileResolver, InlineError, ModContext, SYN_INLINE_MOD_PATH};

pub(crate) struct Visitor<'a, R> {
/// The current file's path.
path: &'a Path,
/// Whether this is the root file or not
root: bool,
/// Whether to annotate paths for inlined modules
annotate_paths: bool,
/// The stack of `mod` entries where the visitor is currently located. This is needed
/// for cases where modules are declared inside inline modules.
mod_context: ModContext,
Expand All @@ -26,12 +30,14 @@ impl<'a, R: FileResolver> Visitor<'a, R> {
pub fn new(
path: &'a Path,
root: bool,
annotate_paths: bool,
error_log: Option<&'a mut Vec<InlineError>>,
resolver: &'a mut R,
) -> Self {
Self {
path,
root,
annotate_paths,
resolver,
error_log,
mod_context: Default::default(),
Expand Down Expand Up @@ -77,12 +83,19 @@ impl<'a, R: FileResolver> VisitMut for Visitor<'a, R> {
let mut visitor = Visitor::new(
&first_candidate,
false,
self.annotate_paths,
self.error_log.as_mut().map(|v| &mut **v),
self.resolver,
);

match visitor.visit() {
Ok(syn::File { attrs, items, .. }) => {
if self.annotate_paths {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the comment above, I wonder if we should check here that no attribute in the input is using our path, returning an error if it is?

let path = first_candidate.to_bytes();
let path = LitByteStr::new(&path, Span::call_site());
let attr_ident = Ident::new(SYN_INLINE_MOD_PATH, Span::call_site());
i.attrs.push(parse_quote! { #[#attr_ident(#path)] });
}
i.attrs.extend(attrs);
i.content = Some((Default::default(), items));
}
Expand Down Expand Up @@ -111,7 +124,7 @@ mod tests {
fn ident_in_lib() {
let path = Path::new("./lib.rs");
let mut resolver = PathCommentResolver::default();
let mut visitor = Visitor::new(&path, true, None, &mut resolver);
let mut visitor = Visitor::new(&path, true, false, None, &mut resolver);
let mut file = syn::parse_file("mod c;").unwrap();
visitor.visit_file_mut(&mut file);
assert_eq!(
Expand All @@ -129,7 +142,7 @@ mod tests {
fn path_attr() {
let path = std::path::Path::new("./lib.rs");
let mut resolver = PathCommentResolver::default();
let mut visitor = Visitor::new(&path, true, None, &mut resolver);
let mut visitor = Visitor::new(&path, true, false, None, &mut resolver);
let mut file = syn::parse_file(r#"#[path = "foo/bar.rs"] mod c;"#).unwrap();
visitor.visit_file_mut(&mut file);
assert_eq!(
Expand Down
Loading