In Pursuit of Laziness

Manish Goregaokar's blog

Mentally Modelling Modules

Posted by Manish Goregaokar on May 14, 2017 in programming, rust, tidbits

Note: This post was written before the Rust 2018 edition, and does not yet incorporate the changes made to the module system.

The module and import system in Rust is sadly one of the many confusing things you have to deal with whilst learning the language. A lot of these confusions stem from a misunderstanding of how it works. In explaining this I’ve seen that it’s usually a common set of misunderstandings.

In the spirit of “You’re doing it wrong”, I want to try and explain one “right” way of looking at it. You can go pretty far1 without knowing this, but it’s useful and helps avoid confusion.



First off, just to get this out of the way, mod foo; is basically a way of saying “look for foo.rs or foo/mod.rs and make a module named foo with its contents”. It’s the same as mod foo { ... } except the contents are in a different file. This itself can be confusing at first, but it’s not what I wish to focus on here. The Rust book explains this more in the chapter on modules.

In the examples here I will just be using mod foo { ... } since multi-file examples are annoying, but keep in mind that the stuff here applies equally to multi-file crates.

Motivating examples

To start off, I’m going to provide some examples of Rust code which compiles. Some of these may be counterintuitive, based on your existing model.

pub mod foo {
    extern crate regex;
    
    mod bar {
        use foo::regex::Regex;
    }
}

(playpen)

use std::mem;


pub mod foo {
    // not std::mem::transmute!
    use mem::transmute;

    pub mod bar {
        use foo::transmute;
    }
}

(playpen)

pub mod foo {
    use bar;
    use bar::bar_inner;

    fn foo() {
        // this works!
        bar_inner();
        bar::bar_inner();
        // this doesn't
        // baz::baz_inner();
        
        // but these do!
        ::baz::baz_inner();
        super::baz::baz_inner();
        
        // these do too!
        ::bar::bar_inner();
        super::bar::bar_inner();
        self::bar::bar_inner();
        
    }
}

pub mod bar {
    pub fn bar_inner() {}
}
pub mod baz {
    pub fn baz_inner() {}
}

(playpen)

pub mod foo {
    use bar::baz;
    // this won't work
    // use baz::inner();
    
    // this will
    use self::baz::inner;
    // or
    // use bar::baz::inner
    
    pub fn foo() {
        // but this will work!
        baz::inner();
    }
}

pub mod bar {
    pub mod baz {
        pub fn inner() {}
    }
}

(playpen)

These examples remind me of the “point at infinity” in elliptic curve crypto or fake particles in physics or fake lattice elements in various fields of CS2. Sometimes, for something to make sense, you add in things that don’t normally exist. Similarly, these examples may contain code which is not traditional Rust style, but the import system still makes more sense when you include them.

Imports

The core confusion behind how imports work can really be resolved by remembering two rules:

  • use foo::bar::baz resolves foo relative to the root module (lib.rs or main.rs)
    • You can resolve relative to the current module by explicily trying use self::foo::bar::baz
  • foo::bar::baz within your code3 resolves foo relative to the current module
    • You can resolve relative to the root by explicitly using ::foo::bar::baz

That’s actually … it. There are no further caveats. The rest of this is modelling what constitutes as “being within a module”.

Let’s take a pretty standard setup, where extern crate declarations are placed in the the root module:

extern crate regex;

mod foo {
    use regex::Regex;

    fn foo() {
        // won't work
        // let ex = regex::Regex::new("");
        let ex = Regex::new("");
    }
}

When we say extern crate regex, we pull in the regex crate into the crate root. This behaves pretty similar to mod regex { /* contents of regex crate */}. Basically, we’ve imported the crate into the crate root, and since all use paths are relative to the crate root, use regex::Regex works fine inside the module.

Inline in code, regex::Regex won’t work because as mentioned before inline paths are relative to the current module. However, you can try ::regex::Regex::new("").

Since we’ve imported regex::Regex in mod foo, that name is now accessible to everything inside the module directly, so the code can just say Regex::new().

The way you can view this is that use blah and extern crate blah create an item named blah “within the module”, which is basically something like a symbolic link, saying “yes this item named blah is actually elsewhere but we’ll pretend it’s within the module”

The error message from this code may further drive this home:

use foo::replace;

pub mod foo {
    use std::mem::replace;
}

(playpen)

The error I get is

error: function `replace` is private
 --> src/main.rs:3:5
  |
3 | use foo::replace;
  |     ^^^^^^^^^^^^

There’s no function named replace in the module foo! But the compiler seems to think there is?

That’s because use std::mem::replace basically is equivalent to there being something like:

pub mod foo {
    fn replace(...) -> ... {
        ...
    }

    // here we can refer to `replace` freely (in inline paths)
    fn whatever() {
        // ...
        let something = replace(blah);
        // ...
    }
}

except it’s actually like a symlink to the function defined in std::mem. Because inline paths are relative to the current module, saying use std::mem::replace works as if you had defined a function replace in the same module, and you can refer to replace() without needing any extra qualification in inline paths.

This also makes pub use fit perfectly in our model. pub use says “make this symlink, but let others see it too”:

// works now!
use foo::replace;

pub mod foo {
    pub use std::mem::replace;
}


Folks often get annoyed when this doesn’t work:

mod foo {
    use std::mem;
    // nope
    // use mem::replace;
}

As mentioned before, use paths are relative to the root module. There is no mem in the root module, so this won’t work. We can make it work via self, which I mentioned before:

mod foo {
    use std::mem;
    // yep!
    use self::mem::replace;
}

Note that this brings overloading of the self keyword up to a grand total of four! Two cases which occur in the import/path system:

  • use self::foo means “find me foo within the current module”
  • use foo::bar::{self, baz} is equivalent to use foo::bar; use foo::bar::baz;
  • fn foo(&self) lets you define methods and specify if the receiver is by-move, borrowed, mutably borrowed, or other
  • Self within implementations lets you refer to the type being implemented on

Oh well, at least it’s not static.




Going back to one of the examples I gave at the beginning:

use std::mem;


pub mod foo {
    use mem::transmute;

    pub mod bar {
        use foo::transmute;
    }
}

(playpen)

It should be clearer now why this works. The root module imports mem. Now, from everyone’s point of view, there’s an item called mem in the root.

Within mod foo, use mem::transmute works because use is relative to the root, and mem already exists in the root! When you use something, all child modules will see it as if it were actually belonging to the module. (Non-child modules won’t see it because of privacy, we saw an example of this already)

This is why use foo::transmute works from mod bar, too. bar can refer to the contents of foo via use foo::whatever, since foo is a child of the root module, and use is relative to the root. foo already has an item named transmute inside it because it imported one. Nothing in the parent module is private from the child, so we can use foo::transmute from bar.

Generally, the standard way of doing things is to either not use modules (just a single lib.rs), or, if you do use modules, put nothing other than extern crates and mods in the root. This is why we rarely see shenanigans like the above; there’s nothing in the root crate to import, aside from other crates specified by extern crate. The trick of “reimport something from the parent module” is also pretty rare because there’s basically no point to using that (just import it directly!). So this is not the kind of code you’ll see in the wild.



Basically, the way the import system works can be summed up as:

  • extern crate and use will act as if they were defining the imported item in the current module, like a symbolic link
  • use foo::bar::baz resolves the path relative to the root module
  • foo::bar::baz in an inline path (i.e. not in a use) will resolve relative to the current module
  • ::foo::bar::baz will always resolve relative to the root module
  • self::foo::bar::baz will always resolve relative to the current module
  • super::foo::bar::baz will always resolve relative to the parent module

Alright, on to the other half of this. Privacy.

Privacy

So how does privacy work?

Privacy, too, follows some basic rules:

  • If you can access a module, you can access all of its pub contents
  • A module can always access its child modules, but not recursively
    • This means that a module cannot access private items in its children, nor can it access private grandchildren modules
  • A child can always access its parent modules (and their parents), and all their contents
  • pub(restricted) is a proposal which extends this a bit, but it’s experimental so we won’t deal with it here

Giving some examples,

mod foo {
    mod bar {
        // can access `foo::foofunc`, even though `foofunc` is private

        pub fn barfunc() {}

    }
    // can access `foo::bar::barfunc()`, even though `bar` is private
    fn foofunc() {}
}
mod foo {
    mod bar {
        // We can access our parent and _all_ its contents,
        // so we have access to `foo::baz`. We can access
        // all pub contents of modules we have access to, so we
        // can access `foo::baz::bazfunc`
        use foo::baz::bazfunc;
    }
    mod baz {
        pub fn bazfunc() {}
    }
}

It’s important to note that this is all contextual; whether or not a particular path works is a function of where you are. For example, this works4:

pub mod foo {
    /* not pub */ mod bar {
        pub mod baz {
            pub fn bazfunc() {}
        }
        pub mod quux {
            use foo::bar::baz::bazfunc;
        }
    }
}

We are able to write the path foo::bar::baz::bazfunc even though bar is private!

This is because we still have access to the module bar, by being a descendent module.



Hopefully this is helpful to some of you. I’m not really sure how this can fit into the official docs, but if you have ideas, feel free to adapt it5!

  1. This is because most of these misunderstandings lead to a model where you think fewer things compile, which is fine as long as it isn’t too restrictive. Having a mental model where you feel more things will compile than actually do is what leads to frustration; the opposite can just be restrictive. 

  2. One example closer to home is how Rust does lifetime resolution. Lifetimes form a lattice with 'static being the bottom element. There is no top element for lifetimes in Rust syntax, but internally there is the “empty lifetime” which is used during borrow checking. If something resolves to have an empty lifetime, it can’t exist, so we get a lifetime error. 

  3. When I say “within your code”, I mean “anywhere but a use statement”. I may also term these as “inline paths”. 

  4. Example adapted from this discussion 

  5. Contact me if you have licensing issues; I still have to figure out the licensing situation for the blog, but am more than happy to grant exceptions for content being uplifted into official or semi-official docs.