
glenglat: Global englacial temperature database


Our paper is currently in public review at Earth System Science Data. Preview it and consider reviewing it at https://essd.copernicus.org/preprints/essd-2024-249/!

Open-access database of englacial temperature measurements compiled from data submissions and published literature. It is developed on GitHub and published to Zenodo.

Data structure

The dataset adheres to the Frictionless Data Tabular Data Package specification. The metadata in datapackage.yaml describes, in detail, the contents of the tabular data files in the data folder:

  • source.csv: Description of each data source (either a personal communication or the reference to a published study).
  • borehole.csv: Description of each borehole (location, elevation, etc.), linked to source.csv via source_id and less formally via source identifiers in notes.
  • profile.csv: Description of each profile (date, etc.), linked to borehole.csv via borehole_id and to source.csv via source_id and less formally via source identifiers in notes.
  • measurement.csv: Description of each measurement (depth and temperature), linked to profile.csv via borehole_id and profile_id.

For boreholes with many profiles (e.g. from automated loggers), pairs of profile.csv and measurement.csv are stored separately in subfolders of data named {source.id}-{glacier}, where glacier is a simplified and kebab-cased version of the glacier name (e.g. flowers2022-little-kluane).
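
For example, the core tables can be loaded and joined with pandas. The sketch below is illustrative only: it assumes it is run from the repository root, ignores the per-source subfolders described above, and assumes the join key names (borehole_id, profile_id, id) from the descriptions above; check datapackage.yaml for the authoritative schema.

    import pandas as pd

    # Core tables in the data folder
    borehole = pd.read_csv('data/borehole.csv')
    profile = pd.read_csv('data/profile.csv')
    measurement = pd.read_csv('data/measurement.csv')

    # Attach profile and borehole metadata to each measurement.
    # Join key names are assumed from the descriptions above.
    df = measurement.merge(
        profile,
        left_on=['borehole_id', 'profile_id'],
        right_on=['borehole_id', 'id'],
        suffixes=('', '_profile'),
    ).merge(
        borehole,
        left_on='borehole_id',
        right_on='id',
        suffixes=('', '_borehole'),
    )
    print(df[['borehole_id', 'profile_id', 'depth', 'temperature']].head())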

Supporting information

Folder sources contains subfolders (with names matching column source.id) with files that document how and from where the data was extracted:

  • Files with a .png, .jpg, or .pdf extension are figures, tables, maps, or text from the publication.
  • Pairs of files with .pgw and .{png|jpg}.aux.xml extensions georeference a .{png|jpg} image, and files with a .geojson extension are the subsequently-extracted spatial coordinates.
  • Files with an .xml extension document how numeric values were extracted from maps and figures using Plot Digitizer. Of these, digitized temperature profiles are named {borehole.id}_{profile.id}{suffix}, where borehole.id and profile.id are either a single value or a hyphenated range (e.g. 1-8). Those without the optional suffix use temperature and depth as axis names; those with a suffix are unusual cases which, for example, may be part of a series (e.g. _lower) or use a non-standard axis (e.g. _date).
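
For illustration, this naming scheme can be parsed with a small regular expression (a sketch only; the exact form of the optional suffix is an assumption):

    import re

    # Matches e.g. '12_1.xml', '1-8_2.xml', or '3_4_lower.xml'
    PROFILE_NAME = re.compile(
        r'^(?P<borehole_id>\d+(?:-\d+)?)_(?P<profile_id>\d+(?:-\d+)?)(?P<suffix>_\w+)?\.xml$'
    )

    for name in ['12_1.xml', '1-8_2.xml', '3_4_lower.xml']:
        match = PROFILE_NAME.match(name)
        print(match.groupdict() if match else f'{name}: no match')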

The repository's license does not extend to figures, tables, maps, or text extracted from publications. These are included in the sources folder for transparency and reproducibility.

Submitter guide

To submit data, send an email to jacquemart@vaw.baug.ethz.ch. Please structure your data as either comma-separated values (CSV) files (borehole.csv and measurement.csv) or as an Excel file (with sheets borehole and measurement). The required and optional columns for each table are described below and in the submission metadata: submission/datapackage.yaml. Consider using our handy Excel template: submission/template.xlsx!

Note: We also welcome submissions of data that have already been digitized, as they allow us to assess the accuracy of the digitization process.

borehole

| name | description | type | constraints |
|---|---|---|---|
| id | Unique identifier. | integer | required: True, unique: True, minimum: 1 |
| glacier_name | Glacier or ice cap name (as reported). | string | required: True, pattern: [^\s]+( [^\s]+)* |
| glims_id | Global Land Ice Measurements from Space (GLIMS) glacier identifier. | string | pattern: G[0-9]{6}E[0-9]{5}[NS] |
| latitude | Latitude (EPSG 4326). | number [degree] | required: True, minimum: -90, maximum: 90 |
| longitude | Longitude (EPSG 4326). | number [degree] | required: True, minimum: -180, maximum: 180 |
| elevation | Elevation above sea level. | number [m] | required: True, maximum: 9999.0 |
| mass_balance_area | Mass balance area (ablation: Ablation area, equilibrium: Near the equilibrium line, accumulation: Accumulation area). | string | enum: ['ablation', 'equilibrium', 'accumulation'] |
| label | Borehole name (e.g. as labeled on a plot). | string | |
| date_min | Begin date of drilling, or if not known precisely, the first possible date (e.g. 2019 → 2019-01-01). | date | format: %Y-%m-%d |
| date_max | End date of drilling, or if not known precisely, the last possible date (e.g. 2019 → 2019-12-31). | date | format: %Y-%m-%d |
| drill_method | Drilling method (mechanical: Push, percussion, rotary; thermal: Hot point, electrothermal, steam; combined: Mechanical and thermal). | string | enum: ['mechanical', 'thermal', 'combined'] |
| ice_depth | Starting depth of ice. Infinity (INF) indicates that ice was not reached. | number [m] | |
| depth | Total borehole depth (not including drilling in the underlying bed). | number [m] | |
| to_bed | Whether the borehole reached the glacier bed. | boolean | |
| temperature_accuracy | Thermistor accuracy or precision (as reported). Typically understood to represent one standard deviation. | number [°C] | |
| notes | Additional remarks about the study site, the borehole, or the measurements therein. Literature references should be formatted as {url} or {author} {year} ({url}). | string | pattern: [^\s]+( [^\s]+)* |

measurement

| name | description | type | constraints |
|---|---|---|---|
| borehole_id | Borehole identifier. | integer | required: True |
| depth | Depth below the glacier surface. | number [m] | required: True |
| temperature | Temperature. | number [°C] | required: True |
| date_min | Measurement date, or if not known precisely, the first possible date (e.g. 2019 → 2019-01-01). | date | format: %Y-%m-%d |
| date_max | Measurement date, or if not known precisely, the last possible date (e.g. 2019 → 2019-12-31). | date | required: True, format: %Y-%m-%d |
| time | Measurement time. | time | format: %H:%M:%S |
| utc_offset | Time offset relative to Coordinated Universal Time (UTC). | number [h] | |
| equilibrated | Whether temperatures have equilibrated following drilling. | boolean | |
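
To illustrate the expected structure, the sketch below writes a minimal borehole.csv and measurement.csv with pandas, using only the required columns and invented placeholder values:

    import pandas as pd

    # One borehole with only the required columns (placeholder values)
    borehole = pd.DataFrame([{
        'id': 1,
        'glacier_name': 'Example Glacier',
        'latitude': 46.5,
        'longitude': 8.0,
        'elevation': 3100,
    }])

    # Two measurements from that borehole (placeholder values)
    measurement = pd.DataFrame([
        {'borehole_id': 1, 'depth': 10.0, 'temperature': -1.2, 'date_max': '2019-12-31'},
        {'borehole_id': 1, 'depth': 20.0, 'temperature': -0.8, 'date_max': '2019-12-31'},
    ])

    borehole.to_csv('borehole.csv', index=False)
    measurement.to_csv('measurement.csv', index=False)

The resulting files can then be checked with the validation steps described below.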

Validation

You can validate your CSV files (borehole.csv and measurement.csv) before submitting them using the frictionless Python package.

  1. Clone this repository.

    git clone https://github.com/mjacqu/glenglat.git
    cd glenglat
  2. Either install the glenglat-submission Python environment (with conda):

    conda env create --file submission/environment.yaml
    conda activate glenglat-submission

    Or install frictionless into an existing environment (with pip):

    pip install "frictionless~=5.13"
  3. Validate, fix any reported issues, and rejoice! (Replace path/to/csvs with the folder containing your CSV files.)

    python submission/validate.py path/to/csvs

Developer guide

Install dependencies

Clone this repository.

git clone https://github.com/mjacqu/glenglat
cd glenglat

Install the glenglat Python environment with conda (or the faster mamba):

conda env create --file environment.yaml
conda activate glenglat

or update it if it already exists:

conda env update --file environment.yaml
conda activate glenglat

Copy .env.example to .env and set the (optional) environment variables.

cp .env.example .env
  • GLIMS_PATH: Path to a GeoParquet file of glacier outlines from the GLIMS dataset with columns geometry (glacier outline) and glac_id (glacier id).
  • ZENODO_SANDBOX_ACCESS_TOKEN: Access token for the Zenodo Sandbox (for testing). Register an account (if needed), then navigate to Account > Settings > Applications > Personal access tokens > New token and select scopes deposit:actions and deposit:write.
  • ZENODO_ACCESS_TOKEN: Access token for Zenodo. Follow the same steps as above, but on the real Zenodo.
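
For reference, a populated .env might look like the following (the path and tokens are placeholders):

    GLIMS_PATH=/path/to/glims_outlines.parquet
    ZENODO_SANDBOX_ACCESS_TOKEN=your-sandbox-token
    ZENODO_ACCESS_TOKEN=your-zenodo-token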

Run tests

Run the basic (frictionless) tests.

frictionless validate datapackage.yaml
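
The same check can also be run from Python, as in this minimal sketch using the frictionless package:

    from frictionless import validate

    # Validate the full data package described by datapackage.yaml
    report = validate('datapackage.yaml')
    print(report.valid)
    for error in report.flatten(['type', 'message']):
        print(error)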

Run the custom (pytest) tests in the tests folder.

pytest

An optional test checks that borehole.glims_id is consistent with borehole coordinates. To run, install geopandas and pyarrow and set the GLIMS_PATH environment variable before calling pytest.

conda install -c conda-forge geopandas=0.13 pyarrow
pytest
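
The gist of that check can be sketched with geopandas (an illustration only, not the actual test; it assumes GLIMS_PATH points to the GeoParquet file described above and that both layers use EPSG 4326):

    import os

    import geopandas as gpd
    import pandas as pd

    # Glacier outlines (columns: geometry, glac_id) and borehole locations
    glims = gpd.read_parquet(os.environ['GLIMS_PATH'])
    borehole = pd.read_csv('data/borehole.csv')
    points = gpd.GeoDataFrame(
        borehole,
        geometry=gpd.points_from_xy(borehole['longitude'], borehole['latitude']),
        crs='EPSG:4326',
    )

    # A borehole with a glims_id should fall inside an outline with that same identifier
    joined = points.sjoin(glims, how='left', predicate='within')
    mismatches = joined[joined['glims_id'].notna() & (joined['glims_id'] != joined['glac_id'])]
    print(mismatches[['id', 'glims_id', 'glac_id']])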

Update submission instructions

The glenglat.py module contains functions used to maintain the repository. They can be run from the command line as python glenglat.py {function}.

To update all generated submission instructions:

python glenglat.py write_submission

This executes several underlying functions in sequence.

Publish to Zenodo

The zenodo.py module contains functions used to prepare and publish the data to Zenodo. They can be run from the command line as python zenodo.py {function}.

To publish (as a draft) to the Zenodo Sandbox, set the ZENODO_SANDBOX_ACCESS_TOKEN environment variable and run:

python zenodo.py publish_to_zenodo

To publish (as a draft) to Zenodo, set the ZENODO_ACCESS_TOKEN environment variable, run the same command with --sandbox False, and follow the instructions. The command first checks that the repository is on the main branch and has no uncommitted changes, that all tests pass, and that no commit has already been tagged with the current datapackage version (function is_repo_publishable).

python zenodo.py publish_to_zenodo --sandbox False

The publish process executes several functions:

  • build_metadata_as_json: Builds a final build/datapackage.json from datapackage.yaml with filled placeholders for id (doi), created (timestamp), and temporalCoverage (measurement date range).
  • build_zenodo_readme: Builds build/README.md from datapackage.yaml.
  • build_for_zenodo: Builds a glenglat release as build/glenglat-v{version}.zip from the newly built build/datapackage.json and build/README.md (see above) and the unchanged LICENSE.md and data/. The zip archive is extracted to build/glenglat-v{version} for review.
  • render_zenodo_metadata: Prepares a metadata dictionary for upload to Zenodo.