Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LasDatasets.jl Package Discussion #44

Open
BenCurran98 opened this issue Jun 4, 2024 · 5 comments
Open

LasDatasets.jl Package Discussion #44

BenCurran98 opened this issue Jun 4, 2024 · 5 comments

Comments

@BenCurran98
Copy link

BenCurran98 commented Jun 4, 2024

Hi guys,

First of all, I wanted to say thanks for the amazing package!

For a bit of context, my team has been using LasIO historically quite a bit, until a year or two ago someone made a forked version to incorporate some v1.4 compatibility and application-specific stuff. Since the functionality here began to diverge from LasIO's current API/functionality, we then made it into a new package and used this internally. Recently, I managed to get permission to make this open source so that we could start collaborating and sharing with the wider community on this. The code for the new package, LAS.jl, is found here

For now I've kept it as a separate package since as I've mentioned it's got quite a different API and some different features (e.g. custom user fields, a LasPy-like interface for storing Julia structs as VLRs, etc.) and I don't want to step on any toes!

I'm opening this issue as a discussion on how we want to work together going forward and any ideas/concerns on your end. I think there are a couple of ways we can move forward:

  • We extract the functionality from LAS.jl and integrate it here as a major version update
  • I can make you collaborators on LAS.jl and we can start making modifications there and eventually shift to using that repo

Happy to discuss this in more detail here. I really appreciate the work you've all done on this package and the last thing I would want is to offend or get in anyone's way 🙂

@evetion
Copy link
Collaborator

evetion commented Jun 4, 2024

Hi @BenCurran98,

We noticed your talk at Juliacon and we look forward to seeing it, and meeting you in person at Juliacon! We're happy to see new features like vlrs, and really appreciate you reaching out. I'll let @visr answer on how we can move forward.

@visr
Copy link
Owner

visr commented Jun 5, 2024

Hi, yeah I just wanted to echo @evetion's remarks, happy to see this package has been useful for you. I'm currently not actively using this package, hence the inactivity here. But it makes sense to aim for one package unless there is a good reason to have two. We could for instance add a notice to LasIO referring to your package. I quickly looked a bit at LAS.jl, and have a few questions. Perhaps some should be moved to separate issues on LAS.jl

  • Have you also used https://github.com/evetion/LazIO.jl? It uses the LASzip shared library to avoid needing to write intermediate uncompressed files. Based on this code it looks like you are not using the cross-platform LASzip_jll but an executable checked in to the repository.
  • Since ArchGDAL is a rather large dependency that is only used for CRS information, it is worth looking into if that can be a package extension, or to use GeoFormatTypes.jl or Proj.jl instead here.
  • Similarly for TypedTables, can that be easily replaced by implementing the Tables interface, so users can choose whether they want TypedTables, DataFrames, or other table types?
  • You mention that LAS.jl started out as a fork of this package. Since this is not apparent in the git history, I'd appreciate a mention of this somewhere in the readme.

@BenCurran98
Copy link
Author

Hi, thanks for the response! I'll lodge issues in LAS.jl to implement these suggestions

  • In terms of LAZ functionality:
    At the moment we're just pointing to some pre-built binaries for LASzip to get it up and running. I've got a PR that's waiting to get merged in Yygdrasil that should provide the proper platform dependencies. I did also take a look at using LazIO.jl, however I found that with the current implementation we use under the hood in LAS.jl to merge points into a table, it turns out to be more performant (computationally, not sure about disk overhead though) to just decompress the whole thing and then read it in that way. Would definitely like to make this better at some point!

  • For dependency sizes I totally agree, I can look into some lighter options there

  • I was waiting to get in contact with the devs here regarding the acknowledgements, since I also wanted to check if you would like to be marked as contributors as well? Adding refs to the repos is easy though I'll do that now :)

@evetion
Copy link
Collaborator

evetion commented Jun 11, 2024

  • At the moment we're just pointing to some pre-built binaries for LASzip to get it up and running. I've got a PR that's waiting to get merged in Yygdrasil that should provide the proper platform dependencies. I did also take a look at using LazIO.jl, however I found that with the current implementation we use under the hood in LAS.jl to merge points into a table, it turns out to be more performant (computationally, not sure about disk overhead though) to just decompress the whole thing and then read it in that way. Would definitely like to make this better at some point!

The laszip build, now that it is on its own, seems like a good addition. Using the laszip executable was our initial approach many years ago, but our datasets kept growing, so we eventually build LazIO.jl.

For what it's worth, the modern approach imho, is to use a library like https://github.com/hobuinc/laz-perf (in the future maybe https://github.com/laz-rs/laz-rs), which is more mature and performant than the original lastools libraries. Ideally, one would use a COPC (cloud optimized point cloud) library built on top of it, like https://github.com/RockRobotic/copc-lib, which would give you indexing and streaming over the internet (plain impossible with the laszip library). The only drawback is that we have to write our own C library in between (or use https://github.com/eschnett/CxxInterface.jl) and build it on Yggdrasil as well, since both projects don't provide a C API.

@BenCurran98
Copy link
Author

  • At the moment we're just pointing to some pre-built binaries for LASzip to get it up and running. I've got a PR that's waiting to get merged in Yygdrasil that should provide the proper platform dependencies. I did also take a look at using LazIO.jl, however I found that with the current implementation we use under the hood in LAS.jl to merge points into a table, it turns out to be more performant (computationally, not sure about disk overhead though) to just decompress the whole thing and then read it in that way. Would definitely like to make this better at some point!

The laszip build, now that it is on its own, seems like a good addition. Using the laszip executable was our initial approach many years ago, but our datasets kept growing, so we eventually build LazIO.jl.

For what it's worth, the modern approach imho, is to use a library like https://github.com/hobuinc/laz-perf (in the future maybe https://github.com/laz-rs/laz-rs), which is more mature and performant than the original lastools libraries. Ideally, one would use a COPC (cloud optimized point cloud) library built on top of it, like https://github.com/RockRobotic/copc-lib, which would give you indexing and streaming over the internet (plain impossible with the laszip library). The only drawback is that we have to write our own C library in between (or use https://github.com/eschnett/CxxInterface.jl) and build it on Yggdrasil as well, since both projects don't provide a C API.

That's a really good idea! I hadn't heard of these libraries before but I agree that something that would reduce the overhead of decompressing and allowing streaming would be really cool to include. If I get some free time I'll research into this and maybe add this as an issue in the repo

@visr visr changed the title LAS.jl Package Discussion LasDatasets.jl Package Discussion Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants