Instantiate struct without setting values

I'm parsing some binary data and need to build up a struct in memory as I go.

In C I would have an instance of my struct and fill in the fields as I parse the data. However, in Rust I can't create an instance of my struct without providing values for all the fields.

I tried using #[derive(Default)], which works for simple data types (such as u8, u32, etc.), but when I have a fixed-size array in the struct it breaks.

For a simple case I could parse into local variables and make the struct at the end with those values, but this doesn't scale very well as I parse more complex cases.

Is there a more idiomatic way to do this in Rust?

#[derive(Default)]
struct MyData
{
    header: [u8; 2],
    data1: u16,
    data2: u32,
}

fn parse_mydata() -> MyData
{
    let mut data : MyData;
    
    data.header[0] = 42;
    data.header[1] = 65;
    data.data1 = 3;
    data.data2 = 9;
    
    return data;
}

fn main() 
{
    let data = parse_mydata();
    
    println!("{} {}", data.header[0] as char, data.header[1] as char);
}

And these are the errors I get.

error[E0381]: use of possibly uninitialized variable: `data.header`
  --> src/main.rs:14:5
   |
14 |     data.header[0] = 42;
   |     ^^^^^^^^^^^^^^^^^^^ use of possibly uninitialized `data.header`

error[E0381]: use of possibly uninitialized variable: `data.header`
  --> src/main.rs:15:5
   |
15 |     data.header[1] = 65;
   |     ^^^^^^^^^^^^^^^^^^^ use of possibly uninitialized `data.header`

error[E0381]: use of possibly uninitialized variable: `data`
  --> src/main.rs:19:12
   |
19 |     return data;
   |            ^^^^ use of possibly uninitialized `data`

It's not the fixed size per se; rather, Default is only implemented for arrays of up to 32 elements. There are a few options to consider:

You can define a struct that holds the (incrementally) built-up state. To easily derive Default and to help yourself write correct initialization, this struct can have all fields wrapped in an Option. For example:

#[derive(Default)]
struct BuilderState {
   header: Option<[u8; 64]>,
   data1: Option<u16>,
   data2: Option<u32>, 
}

As you can surmise, the default value of this struct is all fields set to None. As you parse, you can set them to Some(...), and ultimately create your real struct from this one (possibly defaulting unset options to whatever makes sense).

For fields like data1 and data2, if 0 is actually a fine default, then you can skip wrapping them in Option to begin with.
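A minimal sketch of how the Option-based builder could be turned into the real struct. The `build` method, the `parse_example` function, the error type (a plain field name), and the choice to default `data2` to 0 are all assumptions made for illustration, not something from the original code.

```rust
// The final struct: every field is mandatory once construction is done.
#[derive(Debug, PartialEq)]
struct MyData {
    header: [u8; 64],
    data1: u16,
    data2: u32,
}

// Intermediate state: every field starts out as None.
#[derive(Default)]
struct BuilderState {
    header: Option<[u8; 64]>,
    data1: Option<u16>,
    data2: Option<u32>,
}

impl BuilderState {
    /// Hypothetical helper: consumes the builder and reports the name of the
    /// first required field that was never set.
    fn build(self) -> Result<MyData, &'static str> {
        Ok(MyData {
            header: self.header.ok_or("header")?,
            data1: self.data1.ok_or("data1")?,
            // Assumption: 0 is an acceptable fallback for data2.
            data2: self.data2.unwrap_or(0),
        })
    }
}

/// Hypothetical parse routine that fills the builder step by step.
fn parse_example() -> Result<MyData, &'static str> {
    let mut state = BuilderState::default();
    state.header = Some([0u8; 64]);
    state.data1 = Some(3);
    // data2 is intentionally left unset and falls back to 0 in build().
    state.build()
}
```

The point of the final `build` step is that the "was everything set?" check happens in exactly one place, and the rest of the program only ever sees a fully populated `MyData`.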

You can also approach this like you do in C, using std::mem::zeroed() or std::mem::uninitialized(); both are unsafe functions. Note that std::mem::uninitialized() is deprecated in newer Rust releases in favor of std::mem::MaybeUninit. The former creates an all-zero representation of your type; the latter doesn't initialize anything. In both cases, you have to be very careful not to let the value escape until everything has been set correctly, and to leave no field in a bogus state. A bogus state might be a field whose type does not allow an all-zero value, or (with uninitialized) a field that holds whatever random bits happened to be in memory.
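As a rough sketch of the C-style approach, here is the struct from the question started from an all-zero value. This is only sound because every field is a plain integer or an array of integers; it would not be for a struct containing references, Box, or similar types where all-zero is not a valid value.

```rust
use std::mem;

struct MyData {
    header: [u8; 2],
    data1: u16,
    data2: u32,
}

fn parse_mydata() -> MyData {
    // SAFETY: all fields are integers (or arrays of integers), and the
    // all-zero bit pattern is a valid value for each of them.
    let mut data: MyData = unsafe { mem::zeroed() };

    // Fill in the fields as parsing progresses.
    data.header[0] = 42;
    data.header[1] = 65;
    data.data1 = 3;
    data.data2 = 9;

    data
}
```

For a struct like this one, `#[derive(Default)]` gives the same zeroed starting point without any unsafe code, so the unsafe version is mainly of interest when Default is not available.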

You can also break your build up of state into pieces - instead of trying to set the fields of one large struct, create smaller structs that encapsulate certain chunks of the overall state. Then, as your parse progresses, you can build up just those chunks from locals, and ultimately, use the individual chunks to make the one large struct. This is quite a bit of boilerplate, however.
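One way the "smaller chunks" idea might look, as a sketch. The `Header` and `Body` structs, the little-endian byte layout, and the function names are hypothetical; the real split would follow whatever natural groups exist in the data.

```rust
// Each chunk is small enough to be built from a few locals in one go.
struct Header {
    magic: [u8; 2],
}

struct Body {
    data1: u16,
    data2: u32,
}

// The large struct is assembled from finished chunks only.
struct MyData {
    header: Header,
    body: Body,
}

/// Parses the header chunk and returns it with the remaining input.
fn parse_header(input: &[u8]) -> Option<(Header, &[u8])> {
    if input.len() < 2 {
        return None;
    }
    let (head, rest) = input.split_at(2);
    Some((Header { magic: [head[0], head[1]] }, rest))
}

/// Parses the body chunk (assumed little-endian) and returns it with the rest.
fn parse_body(input: &[u8]) -> Option<(Body, &[u8])> {
    if input.len() < 6 {
        return None;
    }
    let (head, rest) = input.split_at(6);
    let data1 = u16::from_le_bytes([head[0], head[1]]);
    let data2 = u32::from_le_bytes([head[2], head[3], head[4], head[5]]);
    Some((Body { data1, data2 }, rest))
}

fn parse_mydata(input: &[u8]) -> Option<MyData> {
    let (header, rest) = parse_header(input)?;
    let (body, _rest) = parse_body(rest)?;
    Some(MyData { header, body })
}
```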


I'm not sure what the idiomatic way is, I probably would have tried using options as suggested by @vitalyd.

However, as to the error "use of possibly uninitialized variable: `data.header`":

The reason you are getting that error is because you have an uninitialized MyData struct:

// This is uninitialized.
let mut data: MyData;

Even though you derived the Default trait, the variable is still uninitialized. The Default trait provides a default() method that returns an instance of the struct filled with default values. To initialize with default values, you need to call this method:

// This is initialized with default values.
let mut data: MyData = MyData::default();

And your example runs.
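For reference, the parse function from the question with only that one line changed would look roughly like this (the `main` function from the question is left out here and works unchanged):

```rust
#[derive(Default)]
struct MyData {
    header: [u8; 2],
    data1: u16,
    data2: u32,
}

fn parse_mydata() -> MyData {
    // Start from the derived default: every field is zero.
    let mut data = MyData::default();

    data.header[0] = 42;
    data.header[1] = 65;
    data.data1 = 3;
    data.data2 = 9;

    data
}
```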

Try to use "as" as little as possible.

    let mut data : MyData;    
    data.header[0] = 42;
    data.header[1] = 65;
    data.data1 = 3;
    data.data2 = 9;   
    return data;

In theory the Rust compiler could eventually become a little smarter and accept code like that, where you assign all struct fields.

I would collect members in local variables, and instantiate at the end using the field init shorthand

https://doc.rust-lang.org/book/second-edition/ch05-01-defining-structs.html#using-the-field-init-shorthand-when-variables-and-fields-have-the-same-name

struct Foo {
    a: u32,
    b: f64,
    c: String,
}

fn init () -> Foo {
    let a = 5;
    let b = 4.3;
    let c = "things".to_string();
    Foo {a, b, c}
}

Edit: I missed this on the first pass through, sorry. Protip: don't post at 3am.

I still think this is - more or less - the cleanest approach. The question is really how to handle intermediate state as that complexity grows, and while it obviously depends on the specifics of the problem domain, in general I'd rather not put that complexity into the target struct. I'd rather make it more explicit in construction, e.g. via fragment parsers that return and propagate Result rather than using Option for fields that are not-actually-optional post-construction.
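A sketch of what I mean by fragment parsers, using the struct from the original question. The `ParseError` type, the `take` helper, the little-endian layout, and the check that the header equals `*A` are all assumptions for the sake of the example.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    UnexpectedEof,
    BadHeader,
}

#[derive(Debug, PartialEq)]
struct MyData {
    header: [u8; 2],
    data1: u16,
    data2: u32,
}

/// Splits `n` bytes off the front of `input` and advances it past them.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], ParseError> {
    let data: &'a [u8] = *input;
    if data.len() < n {
        return Err(ParseError::UnexpectedEof);
    }
    let (head, rest) = data.split_at(n);
    *input = rest;
    Ok(head)
}

fn parse_header(input: &mut &[u8]) -> Result<[u8; 2], ParseError> {
    let bytes = take(input, 2)?;
    let header = [bytes[0], bytes[1]];
    // Assumption: a valid header is the two bytes "*A" (42, 65).
    if header != *b"*A" {
        return Err(ParseError::BadHeader);
    }
    Ok(header)
}

fn parse_u16(input: &mut &[u8]) -> Result<u16, ParseError> {
    let b = take(input, 2)?;
    Ok(u16::from_le_bytes([b[0], b[1]]))
}

fn parse_u32(input: &mut &[u8]) -> Result<u32, ParseError> {
    let b = take(input, 4)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn parse_mydata(mut input: &[u8]) -> Result<MyData, ParseError> {
    // Each fragment either yields a finished value or aborts the whole parse.
    let header = parse_header(&mut input)?;
    let data1 = parse_u16(&mut input)?;
    let data2 = parse_u32(&mut input)?;
    // Only fully parsed values reach the struct, via field init shorthand.
    Ok(MyData { header, data1, data2 })
}
```

The target struct stays free of Option fields, and the intermediate state lives in ordinary locals that the compiler checks for initialization.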

Another possibility, if you have a few clear stages of construction, would be an enum that you progress at each stage.

enum PartiallyParsed {
  Step1 { a: i32, b:f32, c:String },
  Step2 { a: i32, b:f32, c:String, d:u32, e:[u8; 10] },
  Step3 { a: i32, b:f32, c:String, d:u32, e:[u8; 10], ... },
  ...
}
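
A concrete, compilable variant of that idea might look like the following sketch. The stage names, the fields, and the `advance` method are hypothetical, and the byte layout (two header bytes followed by a little-endian u16) is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum Parsed {
    Start,
    HasHeader { header: [u8; 2] },
    Done { header: [u8; 2], data1: u16 },
}

impl Parsed {
    /// Consumes the current stage and returns the next one, or None if the
    /// input is too short for the step being attempted.
    fn advance(self, input: &[u8]) -> Option<Parsed> {
        match self {
            Parsed::Start => {
                let b = input.get(0..2)?;
                Some(Parsed::HasHeader { header: [b[0], b[1]] })
            }
            Parsed::HasHeader { header } => {
                let b = input.get(2..4)?;
                let data1 = u16::from_le_bytes([b[0], b[1]]);
                Some(Parsed::Done { header, data1 })
            }
            // Already finished: nothing left to do.
            done @ Parsed::Done { .. } => Some(done),
        }
    }
}
```

Because `advance` takes `self` by value, an earlier stage cannot be used by accident once it has been moved into the next one.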

There are some great examples of complex parsers built up out of smaller, well-structured parts using the nom crate, in case you've not already seen that.



There are a lot of parsers written in Rust. They are probably where you want to start learning about this stuff. In particular, the nom parser-combinator library is very useful for building small parsers that can be combined using alt! and do_parse!.

Plus it has help for stuff like converting endianness etc.

Parsing binary data of pods easily (or even more complicated stuff) is precisely what scroll was designed for.

In your example, assuming your binary data was written out in a packed, repr(C)-like manner, you can even use the automatic derive implementation:

#[macro_use] extern crate scroll_derive;
extern crate scroll;

use scroll::Pread;

#[derive(Pread)]
struct MyData
{
    header: [u8; 2],
    data1: u16,
    data2: u32,
}

fn main() 
{
    let bytes = [0xde, 0xad, 0xbe, 0xef, 0xbe, 0xef, 0xde, 0xad];
    let data: MyData = bytes.pread(0).unwrap();
}

It’s pretty well documented, handles endianness too, etc. Let me know if you have any issues or problems if you end up using it. Good luck!


Thanks for all the responses, I really appreciate it

Ah, I clearly missed that part!

I'm aware of the various parsing libraries available; I've simplified my actual case a lot for this post (it's not a 1:1 mapping of parsing to data structure).

I hadn't considered wrapping things in options inside the struct, I think that will work for me in some places.