What is Rust's Into<T> for?

Rust provides a bunch of traits that you may use or implement in your code, but unless you have experienced them first-hand, it can be hard to imagine what their real utility is. For example, if you go read Into’s documentation, all you find is:

Trait std::convert::Into
A value-to-value conversion that consumes the input value. The opposite of From. […]

Yay, very useful. This text tells me what the trait does, which is fine for a reference manual, but it does not tell me when I would find it useful. Mind you, during the initial code reviews for sandboxfs, my reviewer pointed out a few times that I should be using into() or into_iter(), but I never quite figured out why.

This all clicked last week while I was working on EndBASIC. In there, I had a type with a constructor that looked something like this:

struct VarRef {
    name: String,
    ref_type: VarType,
}

impl VarRef {
    fn new(name: String, ref_type: VarType) -> Self {
        Self { name, ref_type }
    }
}

VarRef represents a reference to a variable, and this representation owns the contents of the name field (i.e. the storage for the name string belongs exclusively to this type).

This object is typically constructed while parsing the input source file, which means that the name comes from a string that was already built during the parsing process. Now, the interesting thing about parsing is that a parser goes through the input once and constructs a tree (an AST) to represent that input. The AST typically has a 1:1 mapping to the input, so any string that is constructed during the parsing process should be reused to construct that tree—in theory without duplicating it.

Given this, I started with the interface for VarRef::new that I presented above because it lets us have code like:

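let s: String = reader.next_word();
// String literals are &str patterns, so match on a borrowed view of the word.
match s.as_str() {
    "AND" => Token::And,
    "OR" => Token::Or,
    ...
    // Not a keyword: move the owned string into the AST node.
    _ => Token::Symbol(VarRef::new(s, VarType::Auto)),
}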

In this code snippet, we start by reading a word out of the source file via the next_word() call and storing it into an owned string. Then we check if the string happens to be a keyword: if it is, we transform it into its AST representation and throw away the string. But if we are left with a variable reference, we have to construct its VarRef representation, which requires the name of the variable. Because we don’t need that name any longer in the context of the parser, we can move it into the AST (the VarRef) and thus avoid memory copies. (Memory copies can be expensive in aggregate!)
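To make the move concrete, here is a minimal self-contained sketch of that keyword check. The VarType, Token, and classify definitions are hypothetical stand-ins (the real EndBASIC types are not shown in this post), and classify takes the word directly instead of calling reader.next_word():

```rust
// Hypothetical stand-ins for the types discussed above; the real EndBASIC
// definitions are not shown in this post.
#[derive(Debug, PartialEq)]
enum VarType {
    Auto,
}

#[derive(Debug, PartialEq)]
struct VarRef {
    name: String,
    ref_type: VarType,
}

impl VarRef {
    fn new(name: String, ref_type: VarType) -> Self {
        Self { name, ref_type }
    }
}

#[derive(Debug, PartialEq)]
enum Token {
    And,
    Or,
    Symbol(VarRef),
}

/// Hypothetical helper: turns an owned word into a token.
fn classify(word: String) -> Token {
    // Borrow the word as &str only for the comparison against keywords.
    match word.as_str() {
        "AND" => Token::And,
        "OR" => Token::Or,
        // Not a keyword: the String itself is moved into the VarRef, so the
        // heap buffer holding the name is reused rather than copied.
        _ => Token::Symbol(VarRef::new(word, VarType::Auto)),
    }
}
```

One way to convince yourself that nothing is copied is to record word.as_ptr() before calling classify and compare it against the as_ptr() of the name inside the resulting VarRef: both point at the same heap buffer.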

Problems arose when writing tests because the ergonomics were very poor. All of my test code had stuff of the form "some_literal".to_owned() everywhere (last I counted, about 50 instances), which made lines long and subject to auto-wrapping by rustfmt:

assert_eq!(
    VarRef::new("some_name".to_owned(), VarType::Auto),
    ...);

I was tempted to change the signature of VarRef::new to take a &str instead of an owned String and then make the function itself call .to_owned(), like this:

fn new(name: &str, ref_type: VarType) -> Self {
    Self { name: name.to_owned(), ref_type }
}

However… that seemed very wrong: such a change would make the code simpler to read, sure, but it would cause every call to this function to duplicate the input string, penalizing the “production” version of this code. The tradeoff was not correct: you should definitely improve the testability of your production code, but you should rarely make it less efficient because of it.
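The cost of the &str variant can be observed directly. The following is a rough sketch, with a hypothetical VarType and a hypothetical reuses_buffer helper that are not part of the original code, which checks whether the VarRef ends up pointing at the caller's heap buffer:

```rust
// Hypothetical stand-in for the real type, which is not shown in this post.
#[derive(Debug, PartialEq)]
enum VarType {
    Auto,
}

#[derive(Debug, PartialEq)]
struct VarRef {
    name: String,
    ref_type: VarType,
}

impl VarRef {
    // The tempting variant: borrows the name and always copies it.
    fn new(name: &str, ref_type: VarType) -> Self {
        Self { name: name.to_owned(), ref_type }
    }
}

/// Hypothetical helper: reports whether building a VarRef reused the
/// caller's heap buffer. Only meaningful for non-empty strings, because
/// empty Strings do not allocate.
fn reuses_buffer(word: String) -> bool {
    let original = word.as_ptr();
    // The caller owns a String but can only lend it out, so new() has to
    // allocate a second buffer and copy the bytes into it.
    let var = VarRef::new(&word, VarType::Auto);
    var.name.as_ptr() == original
}
```

For any non-empty word this returns false: the parser already owned a perfectly good String, yet the constructor allocated a fresh one anyway.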

The reality is that we need two versions of this function: one that takes ownership of the string and is used for production code, and one that provides syntactic sugar to simplify the test code. And this is where Into rescues us.

With the Into trait, we can mark our function as taking an object that can be transformed into something else if it needs to. Thus we can say:

fn new<T: Into<String>>(name: T, ref_type: VarType) -> Self {
    Self { name: name.into(), ref_type }
}

Which means: name is now taken by value (so a String argument is moved into the function rather than borrowed) and can be of any type that can be converted into a String. Thus these two call syntaxes become possible:

let owned_string_from_parsing = String::from("something");
VarRef::new(owned_string_from_parsing, VarType::Auto);  // Zero copies!
// owned_string_from_parsing moved and cannot be referenced here any longer.

VarRef::new("some_literal_in_tests", VarType::Auto);  // One copy.

In the production code version, because the values we give to this function are already Strings, the into() call is the identity conversion: it hands back the same String and the compiler can trivially optimize it away. We are left with the original (zero-cost) move semantics that we had. But in tests, because we supply &str, and because Into<String> is implemented for &str, the .into() call ends up allocating a copy of the string (just as we had before with the explicit .to_owned() calls, so it’s not worse).
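Both behaviors can be checked with a small self-contained sketch. VarType is again a hypothetical stand-in, and from_owned and from_literal are hypothetical helpers written only for this illustration; each one reports whether the resulting name still lives in the buffer the caller passed in:

```rust
// Hypothetical stand-in for the real type, which is not shown in this post.
#[derive(Debug, PartialEq)]
enum VarType {
    Auto,
}

#[derive(Debug, PartialEq)]
struct VarRef {
    name: String,
    ref_type: VarType,
}

impl VarRef {
    // Accepts anything convertible into a String: String, &str, and so on.
    fn new<T: Into<String>>(name: T, ref_type: VarType) -> Self {
        Self { name: name.into(), ref_type }
    }
}

/// "Production" path: the caller hands over an owned String.
fn from_owned(word: String) -> (VarRef, bool) {
    let original = word.as_ptr();
    // String -> String is the identity conversion, so this is a plain move.
    let var = VarRef::new(word, VarType::Auto);
    let reused = var.name.as_ptr() == original;
    (var, reused)
}

/// "Test" path: the caller passes a string literal.
fn from_literal(word: &'static str) -> (VarRef, bool) {
    let original = word.as_ptr();
    // &str -> String has to allocate a new buffer and copy the bytes.
    let var = VarRef::new(word, VarType::Auto);
    let reused = var.name.as_ptr() == original;
    (var, reused)
}
```

With a non-empty word, from_owned reports that the buffer was reused and from_literal reports that it was not, which matches the "zero copies" and "one copy" comments above. The same signature could also be spelled with impl Into<String> in argument position; which form to use is mostly a matter of taste.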

Hope this makes this one trait clearer in case you came here looking for answers. And once you understand how it plays out in practice, you start seeing its applicability everywhere! If your only tool is a hammer…
