Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ark-cli for Improved Code Organization and Readability #67

Merged
merged 12 commits into from
Jun 30, 2024

Conversation

tareknaser
Copy link
Collaborator

Description

Previously, ark-cli logic was concentrated in a few files, with large files such as ark-cli/src/main.rs containing all the logic for running the CLI by calling other functions. This made it difficult to debug, patch, and read the code. This PR is the first step towards addressing other issues within the crate.

Changes

  • Each command now resides in its own file, including its parameters and the function to run the logic.
  • Each subcommand is placed in a folder, with each command as a separate file.
  • All commands are included in a module that exports pub enum Commands.
  • Defined a run() method to handle the logic for each command.
  • The main() function now only calls run(), and if an error is encountered, it prints it to stderr. This enhances error handling.
  • Avoided using unwrap() when applicable.
  • Added FIXME notes to track planned changes.

Copy link

Benchmark for 23e3961

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 251.2±1.85µs 250.5±1.00µs -0.28%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.06µs 15.6±0.07µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1343.9±2.21ns 1354.2±9.28ns +0.77%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.2±0.44µs 195.9±0.44µs +0.36%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1715.2±5.72µs 1742.1±26.41µs +1.57%
crc32_resource_id_creation/compute_from_bytes:large 86.6±0.73µs 86.7±0.37µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.3±0.33ns 92.3±0.38ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.3±0.27µs 64.8±1.34µs +0.78%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 929.3±4.43µs 937.8±4.48µs +0.91%
index_build/index_build/../test-assets/ 1060.3±7.42µs 1053.3±7.93µs -0.66%

@tareknaser tareknaser marked this pull request as ready for review May 25, 2024 04:40
Copy link

Benchmark for 49b25b8

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 252.4±0.65µs 250.3±1.12µs -0.83%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.22µs 15.7±0.11µs +0.64%
blake3_resource_id_creation/compute_from_bytes:small 1338.7±9.73ns 1352.0±8.61ns +0.99%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.0±0.40µs 196.8±3.11µs +0.41%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1730.9±9.42µs 1707.7±18.59µs -1.34%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.37µs 86.9±0.65µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.09µs 5.4±0.04µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.61ns 92.4±0.50ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.8±0.18µs 65.1±0.49µs +0.46%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 922.9±4.70µs 934.5±3.47µs +1.26%
index_build/index_build/../test-assets/ 1063.8±6.79µs 1049.0±10.58µs -1.39%

@@ -35,12 +41,13 @@ impl Append {
translate_storage(&Some(self.root_dir.to_owned()), &self.storage)
.ok_or(AppError::StorageNotFound(self.storage.to_owned()))?;

// FIXME: Why do we have `self.kind` and `self.storage`? Both are used to determine the storage type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind is probably file or folder. It could be derived from names, at least for standard storages provided by ARK. But developers could also create their own storages which would be unknown to ark-cli, so the user would need to pass this information explicitly.

We could also figure out the storage type automatically by checking the path. But with atomic versions the path will always point to a folder, so we need to calculate depth of the folder. Folder storages would have depth of 3 (storage/entry/version/value), while file storage should have depth 2 (a file inside each version, i.e. storage/version/table).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind is probably file or folder.

Right.

  • storage: Required parameter of type String. Gets passed to translate_storage() to determine the file path and storage_type
  • kind: Optional parameter of type StorageType and specifies if it's a file or folder. It overwrites the value for storage_type

Why would we need kind if storage already determines the storage type? Are there any cases where using kind would come in handy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only real use-case for kind is custom storages, i.e. when the storage is not shipped by ARK framework, but the user implemented their own instance of FileStorage or FolderStorage. In this case, we can't figure it out by the path. But we still can figure it out by the folder structure as I've described above.

Overall, kind flag is more like a temporary helper. We should determine it automatically.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.
I removed the FIXME since there is a reason to have both flags

@tareknaser tareknaser force-pushed the organize_in_folders branch from e1646c9 to 4296f73 Compare June 1, 2024 14:28
Copy link

github-actions bot commented Jun 1, 2024

Benchmark for bb2f958

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 247.8±0.54µs 248.9±0.47µs +0.44%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.04µs 15.7±0.10µs +0.64%
blake3_resource_id_creation/compute_from_bytes:small 1330.8±8.99ns 1358.8±12.63ns +2.10%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.5±0.70µs 195.7±0.84µs +0.10%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1722.0±9.17µs 1726.9±15.24µs +0.28%
crc32_resource_id_creation/compute_from_bytes:large 86.6±0.23µs 86.7±0.77µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.03µs 5.4±0.08µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.30ns 92.5±1.39ns +0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.4±1.50µs 64.5±0.65µs +0.16%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 921.6±2.57µs 937.1±2.30µs +1.68%
index_build/index_build/../test-assets/ 1086.4±6.51µs 1058.1±2.96µs -2.60%

Copy link

github-actions bot commented Jun 1, 2024

Benchmark for ae97c7f

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.5±1.16µs 251.0±0.89µs +1.01%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.04µs 15.8±0.07µs +1.28%
blake3_resource_id_creation/compute_from_bytes:small 1344.8±7.71ns 1345.5±2.65ns +0.05%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 198.2±1.46µs 195.8±1.95µs -1.21%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1750.1±4.25µs 1718.0±11.32µs -1.83%
crc32_resource_id_creation/compute_from_bytes:large 86.9±0.30µs 86.7±0.23µs -0.23%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.02µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.28ns 92.5±2.16ns +0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.5±0.85µs 64.7±4.02µs +0.31%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 923.9±2.13µs 930.2±3.29µs +0.68%
index_build/index_build/../test-assets/ 1067.2±11.39µs 1048.4±3.40µs -1.76%

Copy link

github-actions bot commented Jun 2, 2024

Benchmark for 3bf3092

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.9±1.35µs 249.0±0.49µs -0.36%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.02µs 15.6±0.08µs +0.65%
blake3_resource_id_creation/compute_from_bytes:small 1344.9±2.20ns 1350.9±6.24ns +0.45%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 197.5±0.32µs 195.8±0.63µs -0.86%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1723.7±4.99µs 1725.1±21.80µs +0.08%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.39µs 86.8±0.34µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.05µs 5.4±0.01µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.57ns 92.4±0.41ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.4±0.25µs 64.8±0.60µs +0.62%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 931.1±1.87µs 935.6±6.71µs +0.48%
index_build/index_build/../test-assets/ 1061.9±5.27µs 1052.9±3.81µs -0.85%

@kirillt
Copy link
Member

kirillt commented Jun 9, 2024

@tareknaser could you investigate where does this output come from? We need to register a bug, I guess.

temp % ark-cli list --link 
Loading app id at /Users/tareknasser/.ark...
http://google.com/                                                                                                                                                               
http://duckduckgo.com/                                                                                                                                                           
Username; Identifier;First name;Last name
booker12;9012;Rachel;Booker
grey07;2070;Laura;Grey
johnson81;4081;Craig;Johnson
jenkins46;9346;Mary;Jenkins
smith79;5079;Jamie;Smith

Copy link

Benchmark for 7b93af4

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 252.0±1.08µs 249.8±1.18µs -0.87%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.08µs 15.8±0.11µs +1.28%
blake3_resource_id_creation/compute_from_bytes:small 1368.7±7.85ns 1385.9±3.18ns +1.26%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 197.2±3.27µs 195.7±0.48µs -0.76%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1738.1±8.35µs 1706.1±28.24µs -1.84%
crc32_resource_id_creation/compute_from_bytes:large 87.0±0.46µs 87.0±0.40µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.01µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.58ns 92.4±0.61ns -0.65%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.2±0.23µs 65.8±0.36µs +0.92%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 954.6±4.71µs 981.3±6.85µs +2.80%
index_build/index_build/../test-assets/ 1038.0±6.71µs 1057.1±3.88µs +1.84%

Copy link

Benchmark for fd60a3b

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.9±1.02µs 250.0±1.31µs +0.04%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.07µs 15.8±0.14µs +1.28%
blake3_resource_id_creation/compute_from_bytes:small 1366.1±2.87ns 1373.3±9.42ns +0.53%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.4±1.54µs 196.9±0.81µs +0.25%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1741.6±3.75µs 1702.4±18.35µs -2.25%
crc32_resource_id_creation/compute_from_bytes:large 86.6±0.24µs 86.7±0.53µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.18µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.42ns 92.4±0.38ns -0.65%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.7±0.40µs 65.9±0.18µs +0.30%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 955.6±14.93µs 978.1±3.95µs +2.35%
index_build/index_build/../test-assets/ 1041.4±13.33µs 1055.6±4.24µs +1.36%

@tareknaser
Copy link
Collaborator Author

could you investigate where does this output come from? We need to register a bug, I guess.

I have traced this down and apparently, ark-cli list --link would attempt to read all resources and print the content of readable files without validation.

Here's the relevant code https://github.com/ARK-Builders/ark-rust/blob/main/ark-cli/src/main.rs#L161-L170 :

EntryOutput::Link => match File::open(path) {
		Ok(mut file) => {
			let mut contents = String::new();
			match file.read_to_string(&mut contents) {
				Ok(_) => (None, None, Some(contents)),
				Err(_) => return None,
			}
		}
		Err(_) => return None,
	},

To address this, I have pushed a commit to check if the file content is a valid link before printing it. This is a bit of a workaround.

@kirillt
Copy link
Member

kirillt commented Jun 29, 2024

@tareknaser

To address this, I have pushed a commit to check if the file content is a valid link before printing it. This is a bit of a workaround.

If a follow-up issue is needed, feel free to register it. I think, we can merge the PR but please remove the last 2 FIXMEs.

@kirillt kirillt self-requested a review June 29, 2024 14:16
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
@tareknaser tareknaser force-pushed the organize_in_folders branch from 2ec4eeb to 02c76db Compare June 30, 2024 11:56
Copy link

Benchmark for edebd94

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 251.0±3.07µs 248.2±0.74µs -1.12%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.04µs 15.8±0.13µs +1.94%
blake3_resource_id_creation/compute_from_bytes:small 1368.3±3.68ns 1386.9±3.64ns +1.36%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 197.6±0.71µs 198.2±2.85µs +0.30%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1742.9±12.14µs 1733.2±13.08µs -0.56%
crc32_resource_id_creation/compute_from_bytes:large 86.9±1.72µs 86.9±0.26µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.32ns 92.4±0.44ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.6±1.54µs 65.4±0.76µs -0.30%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 980.1±5.17µs 954.0±2.81µs -2.66%
index_build/index_build/../test-assets/ 1051.9±11.33µs 1043.8±3.59µs -0.77%

@tareknaser tareknaser merged commit a59e7ed into main Jun 30, 2024
4 checks passed
@tareknaser tareknaser deleted the organize_in_folders branch June 30, 2024 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants