Rusting

The section where I examine what I am learning when compiling Rust code.

11

SEP
2014

Experimenting With Ownership

Let's use a trivial exercise to see what we can learn about ownership, moving, borrowing, and more in Rust. Here's the idea:

  1. We'll allocate a list of numbers
  2. We'll add one to each number in the list
  3. We'll print the resulting list of numbers

This is a simple process requiring only a few lines of code:

fn main() {
    let mut numbers = vec![1u, 2, 3];
    for n in numbers.mut_iter() {
        *n += 1;
    }
    println!("{}", numbers);
}

The output is hopefully what we all expect to see:

$ ./one_function 
[2, 3, 4]

In this code there is just one variable: numbers. That variable owns a list of numbers on the heap and it's scope is limited to the main() function, which is just a way to say that the data exists for the length of that function call. Since all three steps happen in that one function call, ownership doesn't really affect us here.

To better examine what ownership really means, let's add one small twist to our exercise:

  • The increment of each number in the list must happen in a separate function

That doesn't feel like a very big change to me, but exploring this one idea has really helped me get my head around ownership in Rust. Let's talk about why.

My instinct for the simplest way to make this change to the code was to try this:

fn increment_all(mut numbers: Vec<uint>) {
    for n in numbers.mut_iter() {
        *n += 1;
    }
}

fn main() {
    let numbers = vec![1u, 2, 3];
    increment_all(numbers);
    println!("{}", numbers);
}

Unfortunately, this code doesn't compile:

$ rustc broken_move.rs 
broken_move.rs:10:20: 10:27 error: use of moved value: `numbers`
broken_move.rs:10     println!("{}", numbers);
                                     ^~~~~~~
note: in expansion of format_args!
<std macros>:2:23: 2:77 note: expansion site
<std macros>:1:1: 3:2 note: in expansion of println!
broken_move.rs:10:5: 10:29 note: expansion site
broken_move.rs:9:19: 9:26 note: `numbers` moved here because it has type
`collections::vec::Vec<uint>`, which is non-copyable
(perhaps you meant to use clone()?)
broken_move.rs:9     increment_all(numbers);
                                   ^~~~~~~
error: aborting due to previous error

The error message, like most in Rust, is pretty good here. It does assume you are familiar with some Rust concepts though. What it essentially says is that the ownership of our simple list of numbers was moved and then we tried to use what we no longer owned.

The move happens on the line in main() where the code calls increment_all(). When we passed the list as an argument to that function, the numbers variable in increment_all() took over ownership of that list. Just like before, that now means the data exists during that function call, because the variable is scoped to the function. Note that this isn't what causes the compile error. It's fine to move the list into the second function, but it sets the stage for what really went wrong here.

The error points to the very next line of main() when the code tried to print the resulting list. increment_all() has returned and we moved ownership of our list into a variable in that function. Since it's done, so is our list. Accessing it again, to print it out, is a no go.

Rust doesn't allow this because under its rules there can be only one owner of data at any given time.

When the data was moved into increment_all(), the variable inside that function became the new owner. That means the variable in main() lost ownership.

Hopefully that explains the problem, but how do we fix it? My first thought was, can ownership be moved back to main()? Sure it can:

fn increment_all(mut numbers: Vec<uint>) -> Vec<uint> {
    for n in numbers.mut_iter() {
        *n += 1;
    }
    numbers
}

fn main() {
    let mut numbers = vec![1u, 2, 3];
            numbers = increment_all(numbers);
    println!("{}", numbers);
}

This fixes the move problem. The code compiles and again produces the expected output.

The list of numbers is still moved into increment_all(). The difference here is that increment_all() returns the list at the end of the call and main() has been modified to replace the variable contents with that returned list. This moves ownership back into the variable in main() at the end of the increment_all() call, so the printing code is legal again.

Of course, this is a little silly. What if we had moved multiple arguments in? Returning all of them we needed back could be awkward at best. What we need is a way to not move the list when we call increment_all(). Rust has exactly that:

fn increment_all(numbers: &mut Vec<uint>) {
    for n in numbers.mut_iter() {
        *n += 1;
    }
}

fn main() {
    let mut numbers = vec![1u, 2, 3];
    increment_all(&mut numbers);
    println!("{}", numbers);
}

Rust has two ways that you can borrow a value. Putting a & in front of it is a read-only borrow, while a &mut is a mutable (read/write) borrow.

Here we've modified the call of the second function to pass in a borrowed reference to the list, instead of the list itself. That function was also modified to declare that it would receive this referenced type. Beyond that, the code inside increment_all() didn't really change.

This is due to a cool feature of Rust's dot operator (.): it will dereference when needed. This meant numbers.mut_iter() still worked without us needing to write something like (*numbers).mut_iter().

Now, all of these solutions use mutability to actually change the underlying numbers. I wondered if there was a way to do this without needing to edit the data. Of course, there are. Let's try using an iterator to produce the transformed list:

fn increment_all<'a, I: Iterator<&'a uint>>(numbers: I) -> Vec<uint> {
    numbers.map(|&n| n + 1).collect()
}

fn main() {
    let mut numbers = vec![1u, 2, 3];
            numbers = increment_all(numbers.iter());
    println!("{}", numbers);
}

The idea here is to avoid passing in the full list in favor of an iterator that can be used to walk the list. Combined with a call to map(), a new list of modified numbers is produced and returned as a result.

The first new element here is that I've switched to using generics to describe the incoming parameter. This allows the function to take anything that implements the trait Iterator. We're no longer restricted to just a vector.

The second new element here is the lifetime I had to introduce: 'a. The reason is that the function now takes references to some numbers and the period those numbers will live is known in the caller, main(). The explicit specifier of the lifetime here is Rust's way of indicating that those numbers must survive at least through the increment_all() function call, so they can be used in that call.

Technically, there's still one mut in the code above. If we really want to remove that, we can just store the result in a separate variable:

fn increment_all<'a, I: Iterator<&'a uint>>(numbers: I) -> Vec<uint> {
    numbers.map(|&n| n + 1).collect()
}

fn main() {
    let numbers     = vec![1u, 2, 3];
    let incremented = increment_all(numbers.iter());
    println!("{}", incremented);
}

We now have a truly immutable program, which may or may not be helpful in an actual application.

There is another step we might take here. I mentioned before how we have a generic input now that allows us to accept a wider range of arguments. That's nice, but we didn't use an equally flexible return value. We always generate a list of numbers. What if you wanted to tack on other iterators, further modifying the data, before you generated the final list? Can we just return the in-progress Iterator with the map() tacked on? Yes and no.

Rust doesn't currently allow for generic return values, so we can't just say we'll give you back an Iterator. However, we can lookup what map() actually returns and specify that:

fn increment_all<'a, 'r, I: Iterator<&'a uint>>(numbers: I)
    -> std::iter::Map<'r, &'a uint, uint, I> {
    numbers.map(|&n| n + 1)
}

fn main() {
    let numbers                = vec![1u, 2, 3];
    let incremented: Vec<uint> = increment_all(numbers.iter()).collect();
    println!("{}", incremented);
}

This is almost surely the most flexible implementation of increment_all(). It could be combined with other iterators when needed. That said, I feel like the type definitions had to get extra cryptic to allow this. I had to lookup the details of the Map object and add another lifetime for the returned value: 'r. Rust also needed me to explicitly type a variable that I didn't need to before. It feels like a bit of a tradeoff to me.

Most of the above is what I've managed to get out of the not-well-named Lifetimes Guide after many readings and some clarifying discussions in Rust's IRC channel. I know that new and improved documentation for these concepts is in the works.

Comments (0)
Leave a Comment (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

Ajax loader