This repository contains the files for the official US-RSE community website hosted at https://us-rse.org. The site is built with Jekyll and hosted on GitHub.
We encourage the community to contribute to the content of the website.
To do this, fork the repository, make your proposed changes, test locally (see below), and then create a pull request against master. For more details about opening pull requests and issues, see our Contributing Guide.
The map is generated programmatically from the US-RSE member list, so if you have already joined and provided your institution, you should be represented on it. If you see any issues or errors with location lookup (we use geolocation of a named location) please open an issue.
We maintain a list of current and previous job postings in _data/jobs.yml. For each posting, we ask that you provide a name, a location (which can be Remote), an expiration date, and a url to the posting. The expiration date is not shown on the page, but it determines when the job stops appearing. We suggest setting a timeframe such as a month; if you want to extend it, you can open a pull request to update the date. An example posting is shown below. This job will appear on the site until the first of July, 2019.
- {expires: 2019-07-01, location: 'Princeton, NJ', name: 'Research Software Engineer',
url: 'https://main-princeton.icims.com/jobs'}
We will test that all fields are defined, the url exists, and that the "expires" field loads as a datetime.date object in Python. If you copy the format above, you should be ok.
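The checks described above can be sketched in Python as follows. This is a minimal illustration, not the repository's actual test code: the function name `validate_job` is hypothetical, and the url check here only verifies that the address is well formed rather than requesting it over the network.

```python
from datetime import date
from urllib.parse import urlparse

# Fields the text above asks contributors to provide.
REQUIRED_FIELDS = ("name", "location", "expires", "url")


def validate_job(job):
    """Return a list of problems found in one job entry (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not job.get(field):
            problems.append(f"missing field: {field}")

    # A YAML loader such as PyYAML turns an unquoted 2019-07-01 into a
    # datetime.date, so a plain string here means the value was quoted
    # or malformed.
    expires = job.get("expires")
    if expires is not None and not isinstance(expires, date):
        problems.append("expires is not a date")

    # Assumption: only the shape of the url is checked, not whether it resolves.
    url = job.get("url", "")
    if url:
        parsed = urlparse(url) if isinstance(url, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("url is not a well-formed http(s) address")
    return problems


# The example posting from above, as it would look after YAML loading.
example = {
    "expires": date(2019, 7, 1),
    "location": "Princeton, NJ",
    "name": "Research Software Engineer",
    "url": "https://main-princeton.icims.com/jobs",
}
print(validate_job(example))
```

An empty list means the entry passed every check in this sketch.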
You can add an event or training to the site by adding a markdown file in the _events
folder, organized by year. Do not use the full date (e.g. YYYY-MM-DD-.md) in the file name,
because Jekyll will not post pages that it interprets to have a future date in the filename.
A better option is to use a partial date (e.g. YYYY-MM-.md).
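If you want to check a file name before opening a pull request, a small helper along these lines could flag the full-date prefix. The function name and the example file names are hypothetical and used only for illustration.

```python
import re

# Matches names that begin with a full YYYY-MM-DD- prefix.
FULL_DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}-")


def has_full_date_prefix(filename):
    """Return True if the file name starts with a full date such as 2019-07-29-."""
    return bool(FULL_DATE_PREFIX.match(filename))


# Hypothetical file names for illustration.
for name in ("2019-07-29-pearc19.md", "2019-07-pearc19.md"):
    print(name, has_full_date_prefix(name))
```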
Here is an example of a file in _events/2019 for PEARC19:
---
title: PEARC19
location: Chicago, IL
url: https://www.pearc19.pearc.org/
expires: 2019-08-01
event_date: "November 17–22, 2019"
layout: event
repeated: false
---
Join us at [PEARC19](https://www.pearc19.pearc.org/) for a Birds of a Feather (BOF) session "Building a Community of Research Software Engineers." Our session is scheduled for 5:15 PM on Monday, July 29.
The top section is front matter that must include the title, location, url, layout (set to "event"), event date, an expiration date, and a "repeated" variable (true or false). Notice that the event date is a string that doesn't get parsed, while expires must be a date in the format shown. In the bottom section (the content), you can write markdown of any length. While the event is active (before expiration), the full content is shown on the "Events and Training" page. Once it expires, it moves into the events archive. In both cases, clicking on the event takes the viewer to its page, where they can view additional content and the url provided. For archived events, the bulk of the content is only viewable on this page.
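To make the required front matter concrete, here is a rough Python sketch that reads the header of an event file and reports missing or malformed keys. It is not part of the repository: the function names are hypothetical, the sample event is made up, and the parser only handles simple `key: value` lines rather than full YAML.

```python
from datetime import date

# Keys the text above lists as required for an event.
REQUIRED_KEYS = ("title", "location", "url", "layout", "event_date", "expires", "repeated")


def read_front_matter(text):
    """Parse simple `key: value` front matter between the first two `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with front matter")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so urls keep their "https://" part.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    raise ValueError("front matter is not closed")


def check_event(fields):
    """Return a list of problems with the parsed front matter (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in fields]
    if fields.get("layout") not in (None, "event"):
        problems.append("layout must be 'event'")
    if fields.get("repeated") not in (None, "true", "false"):
        problems.append("repeated must be true or false")
    if "expires" in fields:
        try:
            date.fromisoformat(fields["expires"])
        except ValueError:
            problems.append("expires must look like YYYY-MM-DD")
    return problems


# A made-up event file used only to exercise the sketch.
sample = """---
title: Example Workshop
location: Chicago, IL
url: https://example.org/workshop
expires: 2019-08-01
event_date: "July 29, 2019"
layout: event
repeated: false
---
Event description in markdown.
"""
print(check_event(read_front_matter(sample)))
```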
You'll notice that there is a folder called "repeated" in the events folder:
$ ls _events/
2019 2020 repeated
A repeated event is one that happens weekly, monthly, or on a regularly scheduled basis that typically does not change, meaning that you wouldn't need to update the post. A weekly call that has a description and a consistent link to an agenda would be appropriate, while the same call that varies in schedule or requires an updated description would not qualify. An annual event, or one that would require a different description, would not be repeated, and should be placed in a folder named by date. Repeated events are always shown at the top of the events page, and do not expire.
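The rules for where an event is listed can be summarized in a short Python sketch. The function name is hypothetical, and it assumes an event stays active through its expires date; the site's templates may draw that boundary differently.

```python
from datetime import date


def event_section(repeated, expires, today):
    """Return where an event would be listed: 'repeated', 'active', or 'archive'."""
    if repeated:
        # Repeated events are always shown and never expire.
        return "repeated"
    # Assumption: the event is still active on the expires date itself.
    return "active" if today <= expires else "archive"


print(event_section(False, date(2019, 8, 1), date(2019, 7, 15)))
print(event_section(False, date(2019, 8, 1), date(2019, 8, 2)))
print(event_section(True, None, date(2019, 8, 2)))
```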
We have a special header field that you can define if you want a page to redirect elsewhere. We do this by way of a meta tag, and we give the viewer 2 seconds to see a message that they are being redirected. To keep these pages organized, we have them located in the redirects folder:
$ ls pages/redirects/
2020-april-workshop.md
And the front matter should look like the following:
---
layout: page
title: US-RSE Community Building Workshop
permalink: /2020-april-workshop/
redirect: https://us-rse.org/first-community-workshop
---
The above says that the page titled "US-RSE Community Building Workshop" served at permalink /2020-april-workshop will be redirected to https://us-rse.org/first-community-workshop.
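For readers unfamiliar with meta-tag redirects, the following Python sketch renders the kind of tag a page layout could emit for the `redirect` field. It illustrates the standard HTML `http-equiv="refresh"` mechanism with a 2 second delay; the function name is hypothetical and the exact markup in the site's layout may differ.

```python
from html import escape


def redirect_meta_tag(url, delay_seconds=2):
    """Build a meta refresh tag that sends the viewer to url after a delay."""
    # escape() guards against quotes or ampersands breaking the attribute.
    target = escape(url, quote=True)
    return f'<meta http-equiv="refresh" content="{delay_seconds}; url={target}">'


print(redirect_meta_tag("https://us-rse.org/first-community-workshop"))
```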
Tests are run during continuous integration to catch any errors and to preview content. Specifically, usrse.github.io uses the following integrations (with links to configuration files):
- CircleCI previews the site, and tests jobs and mapdata
- GitHub CI includes GitHub triggers and actions
Instructions for running locally, along with details about each, are provided below.
CircleCI is the primary means to preview a pull request, as the site is built and available for preview as an artifact. Additionally, the jobs and map data are tested (details below). There are no credentials or secrets required for this setup, other than the repository needing to be connected to CircleCI, and under settings:
- build forked pull requests should be on
- cancel redundant builds is suggested
- workflows should be enabled
If you want to edit any of the tests, you should edit config.yml. Details about running tests locally are included below. This can be good to do if you change an input file in _data and want to test it.
Jobs are tested for correctness, meaning that all fields are entered, a date string is entered for the "expires" field, and the url is valid. You can run tests locally like:
$ cd tests
$ python -m unittest test_jobs
A script is provided that will clone the repository to a temporary directory, find all commits with a changed job file, and then check out and read each commit to get the jobs present at that time. We then use the name and url of the job as a unique identifier to determine whether the job has been seen. A job with the same name and url, and thus the same unique identifier, is considered the same job. You can run this script as is if you just want to derive counts:
$ python scripts/count_jobs.py
Cloning repository https://github.com/USRSE/usrse.github.io
Found 43 commits for _data/jobs.yml
Found a total of 35 unique jobs across 43 commits.
Or you can provide an output file to save the compiled job content:
$ python scripts/count_jobs.py all-jobs.yml
Cloning repository https://github.com/USRSE/usrse.github.io
Found 43 commits for _data/jobs.yml
Found a total of 35 unique jobs across 43 commits.
Saving to output file /home/vanessa/Desktop/Code/usrse/usrse.github.io/all-jobs.yml
The cloned repository is always cleaned up, and the parsing is done separately from the script.
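The de-duplication idea behind the count can be sketched in a few lines of Python. This is not the code in scripts/count_jobs.py: the function name and the sample data are hypothetical, and each snapshot stands in for the job list read from one commit.

```python
def count_unique_jobs(snapshots):
    """Collect unique jobs across commits, keyed by (name, url)."""
    seen = {}
    for jobs in snapshots:
        for job in jobs:
            key = (job["name"], job["url"])
            # Keep the first version of each job that we encounter.
            seen.setdefault(key, job)
    return seen


# Made-up snapshots: the second commit adds one job and keeps the first.
snapshots = [
    [{"name": "Research Software Engineer", "url": "https://example.org/jobs/1"}],
    [
        {"name": "Research Software Engineer", "url": "https://example.org/jobs/1"},
        {"name": "Data Engineer", "url": "https://example.org/jobs/2"},
    ],
]
unique = count_unique_jobs(snapshots)
print(f"Found a total of {len(unique)} unique jobs across {len(snapshots)} commits.")
```

Keying on the pair rather than on the name alone means two postings with the same title at different urls are counted separately.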
To preview the site on CircleCI, after it finishes building, make sure you are logged in and following the repository, and then click on the "Artifacts" tab. You can select the static file to open and preview in your browser.
To preview the site locally, you'll need to install Jekyll. It's then typical to go to the root of the site and issue (just once):
$ bundle install
And then (also in the top level directory of your forked repository) run
$ jekyll serve
# or
$ bundle exec jekyll serve
and open your browser to http://localhost:4000.
If you are having trouble, try rm -rf _site, followed by bundle update, then bundle exec jekyll serve.
A legacy Rakefile is kept with the repository to allow for a manual rake test to use the html-proofer to check links.
This was previously deployed on TravisCI, however it was very buggy and failed often since the checker had no concept of retry. While the travis instruction has since been removed, you can look at the old configuration file here in a previous commit. To run this previous test locally on your own you can do:
$ rake test
This has been replaced by the "URLChecker" in GitHub CI, which does have retry and other nice features to make it less error prone, discussed next.
The URLChecker is a GitHub action that @vsoch worked on to contribute retry and some other nice features for the repository here. These features are available as of version 0.1.6, which is used in the workflow.
The workflow clean-expired-jobs.yml is run nightly and uses the same function from the urlchecker to check the links of jobs in jobs.yml. If a job is expired and its url check fails, the job is removed from the file. If a job is not expired and the check fails, we want to know about it, so the test will fail.
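The decision logic of the nightly cleanup can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual code: `clean_jobs` is a hypothetical name, `url_ok` stands in for whatever url check the workflow performs, and a job is treated as expired once today is past its expires date.

```python
from datetime import date


def clean_jobs(jobs, url_ok, today):
    """Return (kept, broken): jobs to keep, and unexpired jobs whose url check failed."""
    kept, broken = [], []
    for job in jobs:
        expired = today > job["expires"]  # assumption about the boundary
        if url_ok(job["url"]):
            kept.append(job)
        elif expired:
            # Expired and the link fails: drop the job from the file.
            continue
        else:
            # Not expired but the link fails: keep it and report the failure.
            kept.append(job)
            broken.append(job)
    return kept, broken


# Made-up jobs and a stand-in url check for illustration.
jobs = [
    {"name": "A", "url": "https://example.org/a", "expires": date(2019, 6, 1)},
    {"name": "B", "url": "https://example.org/b", "expires": date(2019, 9, 1)},
]
kept, broken = clean_jobs(jobs, url_ok=lambda url: False, today=date(2019, 7, 1))
print([job["name"] for job in kept], [job["name"] for job in broken])
```

A non-empty `broken` list corresponds to the case where the test should fail.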
This simple greetings action greets first time users (for issues). The logic of this is determined by the greetings.yml workflow.
Two scripts help to create a branch, with a name that starts with the prefix update/member-counts, containing an updated member counts file. The workflow member-counts.yaml will generate an updated file and commit and push to a new branch, and it uses pull-request.sh to then open a PR with the new branch to the repository. For GitHub CI, there are currently no secrets or credentials, and no setup is required: having actions enabled for the repository and placing the file under .github/workflows enables it.
Why do we use different services?
Using multiple "free tier" CI services is a common thing for open source projects to do. There are several reasons to do this:
- we can better leverage each free tier, which typically caps the number of jobs run in parallel or the minutes per month, by spreading work over multiple services.
- we can scope a particular kind of test to a service. For example, one service might just be to test the core software, another might be to build and deploy containers, and a third might be to preview a site.
- each CI service offers unique features. For example, GitHub has the closest integration with the repository here, and CircleCI allows us to preview artifacts.
We use the all-contributors tool to generate a contributors graphic below.