Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geopackage read bugfix #838

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

JoshCu
Copy link
Contributor

@JoshCu JoshCu commented Sep 3, 2024

Fixes regex matching both flowpaths and flowpath_attributes to flowpath

( specifically geopackages with both flowpaths and flowpath_attributes tables )
This adds a check for an end of sequence character after flowpath(s) so that flowpaths_attributes doesn't get mapped to flowpaths

Additionally adds logic to prevent the error that would occur from attempting to merge an empty dataframe

separated out from #833 as this is a bugfix and that is a larger fundamental change.

@AminTorabi-NOAA
Copy link
Contributor

Can you also make a PR for #833. I think it was great idea using sqlite, it's much faster using that

@jameshalgren
Copy link
Contributor

jameshalgren commented Sep 4, 2024

Can you also make a PR for #833. I think it was great idea using sqlite, it's much faster using that

#839 takes a step towards #833 with not as many performance gains but a far, far simpler solution and much easier maintenance path. Depending on the environment and the default behavior of geopandas to use fiona or pyogrio, the performance improvement may vary. In anycase, the pyogrio library should be more performant for the simple case of opening the geopackage to read the list of layers.

Note that #839 will fail the CI tests until rebased onto this PR...

In the meantime, the sqlite3 option is still out there for later refactoring #604 :)

@AminTorabi-NOAA
Copy link
Contributor

Can you also make a PR for #833. I think it was great idea using sqlite, it's much faster using that

#839 takes a step towards #833 with not as many performance gains but a far, far simpler solution and much easier maintenance path. Depending on the environment and the default behavior of geopandas to use fiona or pyogrio, the performance improvement may vary. In anycase, the pyogrio library should be more performant for the simple case of opening the geopackage to read the list of layers.

In the meantime, the sqlite3 option is still out there for later refactoring #604 :)

I made a PR with pyogrio that significantly improves the speed. However, we're restricted in the packages we can use since they need to be installed via pip on WCOSS. I'm following up to see if we can get approval to use that package. The good thing about SQLite, though, is that it comes with Python and doesn't require a pip install

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants