Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the generated vector-like object use a single length #26

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mangelats
Copy link
Contributor

@mangelats mangelats commented Sep 3, 2020

So I've finished with a working version.

Notice that there are a lot of methods missing and tests won't work because of that. Also #sao_derive and zip_iter! not implemented for now.

closes #19

@Luthaf
Copy link
Member

Luthaf commented Sep 3, 2020

Thanks you for the PR! I'll have a look at it when I can find some time, probably this weekend.

@mangelats
Copy link
Contributor Author

By the way, this PR points to master but in reality it should point a new branch.

#visibility fn iter(&self) -> Iter {
Iter(#create_iter)
}
pub struct VecIter<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why replace the iterator type as well? The previous one should still be constructible through the individual slices. The previous type should also optimize a bit better since we get all the marker traits (ExactSizeIterator and friends) from std, as well as specialization which std can use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't understand what the std was doing (RawVec doesn't have .iter()) because the code is not straight forward, so I made my simple iterator. Actually 2 days ago I realized that the standard is simply converting to slice and using its iterator. So yes, we could keep the old code with minor changes.

A point for a future debate would be if making a well-implemented custom iterator may be made more performant than using nested zips (but with the obvious downside of more code maintenance)

if self.n >= self.vec.len() {
return (0, Some(0))
}
let left = self.vec.len() - self.n;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: how about remaining instead of left?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

impl<'a> IntoIterator for #slice_mut_name<'a> {
type Item = #ref_mut_name<'a>;
type IntoIter = #detail_mod::IterMut<'a>;
impl<'a> IntoIterator for &'a #vec_name {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These implementation where guarded by if let Visibility::Public(_) = *visibility previously, why remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually remade it when replacing the type. I'll reverse this file and change it to use the slices instead of a custom iterator

impl<'a> IntoIterator for &'a mut #vec_name {
type Item = #ref_mut_name<'a>;
type IntoIter = #detail_mod::IterMut<'a>;
impl<'a> IntoIterator for &'a mut #vec_name {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also missing implementations for slice & slice mut

@@ -191,7 +191,7 @@ pub fn derive(input: &Input) -> TokenStream {
})
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you remove the additional whitespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm so sorry about that. It's a misconfiguration of my editor.

#[doc = #vec_name_str]
/// ::shrink_to_fit()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
/// shrinking all fields.
pub fn shrink_to_fit(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be added back later, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :)

}

/// Similar to [`
#[doc = #vec_name_str]
/// ::truncate()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.truncate)
/// truncating all fields.
pub fn truncate(&mut self, len: usize) {
#(self.#fields_names_1.truncate(len);)*
unsafe {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this taken from std::vec::Vec implementation?

Copy link
Contributor Author

@mangelats mangelats Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a modified version of it for multiple raw vectors :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's the same for most other methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add comments stating that

#(self.#fields_names_1.push(#fields_names_2);)*
self.reserve(1);
#(write_to_raw_vec(&mut self.data.#fields_names_1, #fields_names_2, self.len);)*
self.len += 1;
}

/// Similar to [`
#[doc = #vec_name_str]
/// ::len()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.len),
/// all the fields should have the same length.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part of the doc is no longer required \o/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D

}

/// Similar to [`
#[doc = #vec_name_str]
/// ::insert()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.insert).
pub fn insert(&mut self, index: usize, element: #name) {
fn insert_into_raw_vec<T>(buf: &mut RawVec<T>, len: usize, value: T, index: usize) {
unsafe {
// infallible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on what you mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the original std code.

I believe that what is saying is that this code is actually safe. Not 100% sure though.


/// Create a slice of this vector matching the given `range`. This
/// is analogous to `Index<Range<usize>>`.
pub fn slice(&self, range: ::std::ops::Range<usize>) -> #slice_name {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to be added back later?

@Luthaf
Copy link
Member

Luthaf commented Sep 13, 2020

Overall the implementation looks sane, but I think it is missing documentation & tests now that we are using unsafe directly.

Since there is quite a bit of work to be done here, what's your preferred way of going forward? I can create a separate branch to merge this even if it is unfinished. If you want to publish this on crates.io, I would like to keep compatibility with the stable compiler, so that would mean using a feature to enable this optimization. Otherwise you can use the code directly from the git branch in your own code as well.

@mangelats
Copy link
Contributor Author

Thank you for the review! :)

I believe the end goal should be to have this code merged in master, disabled under a feature by default.

I'm thinking what would be the safest way of doing things and I think I came with an idea:

  • Make an issue tracking the smaller issues and a new branch from master to make the PRs against.
  • Decide the code organization. I'm leaning towards having 3 folders (modules): stable, single_len and common. The reason for that is that the code for some files has nothing in common from the two implementations. I tried doing something like single_len_vec.rs and so on but it ends up being a bit dirty if everything is in the same folder.
  • When the other two things are done, remove all tests. Then follow the cycle: open issue for a single method or additionmake test about what we expect about it (and ensure that it fails)implement itPR.

This would generate a lot of really small issues but I feel that everything would be better organized and we will be sure that the tests work. This means discarding this PR in favour of making smaller ones (the code can be copy-pasted from here).

What do you think?

@Luthaf
Copy link
Member

Luthaf commented Sep 20, 2020

I agree that this could be built against a separate branch as multiple smaller PR, I've created a nightly branch for this!

Decide the code organization. I'm leaning towards having 3 folders (modules): stable, single_len and common. The reason for that is that the code for some files has nothing in common from the two implementations. I tried doing something like single_len_vec.rs and so on but it ends up being a bit dirty if everything is in the same folder.

Since only the XXXVec part should change, I would prefer to rename the current vec.rs file to stable_vec.rs, and create a new ``nightly_vec.rs`; and then select one with

#[cfg_attr(feature = "nightly", path = "nightly_vec.rs")]
mod vec;

#[cfg_attr(not(feature = "nightly"), path = "stable_vec.rs")]
mod vec;

Keeping the rest of the code the same as much as possible.

When the other two things are done, remove all tests.

I don't see why this is required. I would rather keep the current tests, potentially using unimplemented!() in the generated code where necessary to keep it compiling.

On top of that we would want to add more tests (one for for each Vec function, potentially following the example of std), which can be tracked separately.

Then follow the cycle: open issue for a single method or addition → make test about what we expect about it (and ensure that it fails) → implement it → PR.

That's one way of doing it, I would be fine with it but I feel it would generate a lot of unnecessary noise. What would be wrong with a single issue containing a list of functions to implement (taken from std)? Do you expect the make test about what we expect about it (and ensure that it fails) step to need a lot of discussion?

Anyway, I would be fine with multiple smaller issues, but (as you can see from my reply time) I have somehow limited bandwidth to work on this repository =)


As a starting point, a minimal implementation of single length vec using unimplemented!() as required, adding a cargo feature to select the new implementation and enabling it in CI looks like the way forward to me.

@Luthaf
Copy link
Member

Luthaf commented Sep 20, 2020

Also, before spending too much time working on tests & setup for this single length optimization, it would be good to have the code in a state able to run benchmarks (even without documentation or tests), to at least validate this is a worthy investment of your time!

@mangelats
Copy link
Contributor Author

Also, before spending too much time working on tests & setup for this single length optimization, it would be good to have the code in a state able to run benchmarks (even without documentation or tests), to at least validate this is a worthy investment of your time!

For me, the added correctness of a single length is reason enough to invest some time to it :)
That said, I'm eager to see if it would make a difference. How about we design a benchmark? What would we need to do so?

Since only the XXXVec part should change, I would prefer to rename the current vec.rs file to stable_vec.rs, and create a new nightly_vec.rs; [...]

Actually vec.rs and iter.rs will need to be versioned, but the same could be done with both of them. What I was thinking was exactly the same but separating them in folders (stable/vec.rs, stable/iter.rs, nightly/vec.rs and nightly/iter.rs); either way is fine for me.

As for removing the tests and making them again, I thought that doing tests and code at the same time would improve both, but looking at it now it's probably a waste of time: we can always make betters tests later on.

@mangelats mangelats marked this pull request as draft October 2, 2020 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use one length and capacity variable for whole struct
2 participants