Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multipage Deallocator #19

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Multipage Deallocator #19

wants to merge 16 commits into from

Conversation

DevonSchwartz
Copy link
Collaborator

Changed data page deallocator to take a list of data pages and deallocate them without acquiring and releasing lock after each dealloc.

@@ -292,6 +294,7 @@ impl PageAllocator for Option<PerCpuPageAllocator> {

fn dealloc_dir_page_list(&self, pages: &DirPageListWrapper<Clean, Free>) -> Result<()> {
if let Some(allocator) = self {
let page_list = pages.get_page_list_cursor();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line, since we obtain the cursor for the list on the next line.

@@ -343,6 +346,83 @@ impl PerCpuPageAllocator {
Ok(())
}
}


fn dealloc_multiple_page(&self, mut page_list : Cursor<'_, Box<LinkedPage>>) -> Result<()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to move this code into dealloc_data_page_list so that we don't have to pass a Cursor type around.



fn dealloc_multiple_page(&self, mut page_list : Cursor<'_, Box<LinkedPage>>) -> Result<()> {
// hash map to store free list lock for every cpu
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this comment -- it stores the free list for each CPU, not the lock, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I made sure to change the comment.

if let Some(page) = page {
pr_info!("Page: {}", page.get_page_no());

let cpu : usize = ((page.get_page_no() - self.start) / self.pages_per_cpu).try_into()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a function defined elsewhere to do this calculation (if not, we should make one).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go ahead and make a helper function to do this calculation.


let cpu : usize = ((page.get_page_no() - self.start) / self.pages_per_cpu).try_into()?;

if matches!(cpu_free_list_map.get(&cpu), None) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are only matching against one pattern, and it's the Option type, it would be more idiomatic/cleaner to do if cpu_free_list_map.get(&cpu).is_none().

}

// add cpu page to vector (vector is mutable)
let cpu_page_vec : &mut Vec<PageNum> = cpu_free_list_map.get_mut(&cpu).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ? rather than unwrap here -- even though this should always succeed, it's safer to avoid using unwrap wherever possible, since the kernel will panic and crash if we call unwrap on an error.

Copy link
Collaborator Author

@DevonSchwartz DevonSchwartz Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I change it to ?, I get the following compiler error:

"cannot use the ? operator in a method that returns `core::result::Result<(), kernel::Err

Update: I think I made the unwrap safe by calling is_some() on the option returned by the get_mut() function

}
};

// check that the page was not already present in the tree. CAUSING ERRORS DUE TO TYPE ANNOTATION ISSUES
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What error is this running into?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to resolve that error but forgot to delete the comment. I'll make sure to delete it.

@DevonSchwartz
Copy link
Collaborator Author

Applied changes. The deallocator has been tested with 1 CPU, but I am struggling to create a scenario where pages are allocated on multiple CPUs.

@DevonSchwartz
Copy link
Collaborator Author

Confirmed that pages removed from all cpus and removed prinfo statements.

Comment on lines 277 to 282
if cpu_free_list_map.get(&cpu).is_none() {
cpu_free_list_map.try_insert(cpu, Vec::new())?;
}

// add cpu page to vector (vector is mutable)
let cpu_page_vec_option : Option<&mut Vec<PageNum>> = cpu_free_list_map.get_mut(&cpu);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner (and potentially more efficient) to take out the if statement that checks if cpu is in cpu_free_list_map; instead, you can obtain cpu_page_vec_option as you currently do, and then rather than returning an error if cpu_page_vec_option is None below, create a new vector with the page number and insert it into the map.

Also -- the convention here would be to just name the vector cpu_page_vec; it doesn't have to reflect the fact that it's an Option.

Comment on lines 284 to 290
if cpu_page_vec_option.is_some() {
let cpu_page_vec : &mut Vec<PageNum> = cpu_page_vec_option.unwrap(); // safe unwrap
cpu_page_vec.try_push(page.get_page_no())?;
} else {
pr_info!("CPU not added to RBTree\n");
return Err(EINVAL);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another Rust idiom -- you can structure this code as follows:

if let Some(cpu_page_vec) = cpu_page_vec_option() {
    cpu_page_vec.try_push(page.get_page_no())?;
} else {
    . . . 
}

It's a bit more concise. If you rename cpu_page_vec_option to cpu_page_vec, Rust will also let you write the if condition like this:

if let Some(cpu_page_vec) = cpu_page _vec 

Within the if case, cpu_page_vec will refer to the vector itself; outside of that, the variable name will still refer to the Option<Vec<>>.


fn dealloc_multiple_page(&self, cpu_free_list_map : RBTree<usize, Vec<PageNum>>) -> Result<()> {
// add pages to corresponding free list in ascending cpu # order
for (cpu, page_nos) in cpu_free_list_map.iter() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure -- have you confirmed/checked that this always iterates over the CPU keys in ascending order?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confirmed through dmesg that this iterates over CPU keys in ascending order (I think the RB tree default is ascending), but I don't know how to formally test it, so I'll remove the comment.

Comment on lines 412 to 415
fn pageno2cpuid(&self, page_no : PageNum) -> Result<usize> {
let cpu: usize = ((page_no - self.start) / self.pages_per_cpu).try_into()?;
Ok(cpu)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not already a function for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found one that does this in balloc.rs. I noticed in dealloc_page the same formula is used to calculate the cpu.

reconfig.sh Outdated
Comment on lines 1 to 12
#!/bin/bash
sudo umount /dev/pmem0
sudo rmmod hayleyfs
make LLVM=-14 fs/hayleyfs/hayleyfs.ko
sudo insmod fs/hayleyfs/hayleyfs.ko
sudo mount -o init -t hayleyfs /dev/pmem0 /mnt/pmem
touch test.txt
echo "Hello World" > test.txt
sudo mv test.txt /mnt/pmem
rm test.txt
cd /mnt/pmem
pwd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you written any tests beyond this? I think it would be good to add (in a tests/ directory at the top level) more cases, e.g. larger files, checking that everything works properly when there are multiple files in the system, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added two tests - one testing a large file and one testing a series of small files.

Copy link
Collaborator

@hayley-leblanc hayley-leblanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The multi-page deallocate in SquirrelFS looks good! I've left a few suggestions/comments on the tests.

long int enlarge_file(char *path, long int size) {
int fd = open(path, O_CREAT | O_RDWR);
assert (fd > -1);
lseek(fd, size, SEEK_SET);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually increase the file's size? According to lseek documentation (https://man7.org/linux/man-pages/man2/lseek.2.html),

lseek() allows the file offset to be set beyond the end of the file (but this does not change the size of the file).

Edit: I see, the increase comes from the fputc writing a character at offset size. I would still suggest switching to one of the following approaches.

A more standard way of increasing the file's size (and one that would be sure to exercise page allocation code paths) would be to use the write system call to write a specified amount of data to the file: https://man7.org/linux/man-pages/man2/write.2.html, or the truncate syscall to set its size to a specified value. It doesn't actually matter what is written, so you could allocate a buffer but not update/set any of its bytes and just write whatever is already in it to the file. You can also do this with a FILE* obtained from fopen/fdopen using a function like fprintf, but the write system call may be easier because I think fprintf requires you to provide bytes to write as a string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion. I added a write call and it was much cleaner.

fputc('\0', fp);
fclose(fp);
close(fd);
fd = open(path, O_CREAT | O_RDWR);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why close and reopen the file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are write. I over complicated the code with lseek and changed to a write call.

Comment on lines 37 to 40
long int new_size = enlarge_file(path, multiple * PAGESZ);
used_all_pages = prev_size == new_size;
multiple = used_all_pages ? multiple : multiple + 1;
prev_size = new_size;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand exactly what this loop is doing; why increase multiple in this way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop was intended to extend the file to use the maximum amount of pages before we run out, but that was overcomplicated. I now borrow the code from large_write and allocate 200000 pages instead.

prev_size = new_size;
}
// a new file with one less page should be able to be allocated after removal
assert(prev_size > 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A stronger check would be to assert that the file you created is the expected size.

Comment on lines 18 to 27
for (int i = 0; i < num_files; i ++) {
sprintf(filename, "/mnt/pmem/%d", i);
int fd = open(filename, O_CREAT | O_RDWR);
lseek(fd, PAGESZ * 2, SEEK_SET);
FILE *fp = fdopen(fd, "w");
assert(fp);
putc('\0', fp);
fclose(fp);
close(fd);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as the other test w.r.t. increasing file size.


assert(statvfs("/mnt/pmem", &stat) == 0);
unsigned long pages_end = stat.f_bfree;
assert(pages_start == pages_end);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another useful sanity check would be to ensure that between creating and deleting the big file, the number of pages in use has changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants