Greene Laboratory Onboarding Information

Mission Statement

We view our core purpose as the development of methodological advances and integrative systems that make analysis of big data, particularly gene expression data, as routine in wet-bench biology labs as PCR. To accomplish this, we will write good code, perform solid and reproducible analyses, and disseminate our results widely through approachable publications and webservers. We recognize that trust, both in the process and in our results, is of primary importance to the biologists that use our methods and webservers. Therefore, we strive to make our source code as open and accessible as possible. When we submit papers, we expect that the analytical code behind those papers will be something that we can be proud of. To these ends, we will provide reviewers and the scientific community with all source code required to generate figures in the paper that result from computational analyses.

Expectations

Your role: We expect that you will take primary responsibility for the success of your research project and career development. As a member of the lab, you are expected to participate fully in the team. In general, lab members are expected to follow weekday working hours that include 9:30AM to 4:00PM in their local timezone to facilitate discussion within the group. We will aim to schedule meetings that respect these hours to the extent possible between the Eastern and Pacific timezones, recognizing that certain meetings (weekly kick-off and demo day) may fall outside of these hours in certain timezones. In no cases should recurring meetings be scheduled that routinely fall outside of 8:00AM to 5:00PM for any lab member working within the continental United States. The Department of Biomedical Informatics at CU, which our lab is part of, provides assigned workspaces for those on campus three or more days per week. Those on-site fewer than three days per week use hotelling spaces. Any postdoc or staff member working remotely for any number of days per week is expected to have an approved Remote Work Participation Agreement in accordance with campus policies.

Casey's role: Casey’s goal is to facilitate your success as well as that of your project. Within your project, Casey will serve as a sounding board for ideas, will help you plan your project, and will help to devise experiments to test your hypotheses. To facilitate your success, Casey will help you to plan your training, to devise a career plan that can take you to where you want to go, to advise you on your project-risk portfolio, and to provide guidance on other elements of career and project development as needed.

Deadlines: Our lab has worked hard to develop a reputation for high-quality science that is well presented. We all benefit from this reputation, but we must also work to maintain it. Abstracts for meetings must be shared with all co-authors, including Casey, at least one week prior to the deadline for submission. Failure to abide by this guideline will result in missing the opportunity in question.

Trainees in the lab will often receive opportunities to present their work at scientific conferences. These presentations reflect on the entire lab. Oral presentations on projects must be presented to the research lab during a braintrust meeting, and lab members are expected to address feedback that is provided. Once a project has been presented and the feedback incorporated, additional practice presentations in advance of meetings are optional. Poster presentations should be shared in the #general Slack channel at least a week before printing.

Code of Conduct: All members of the lab, along with visitors, are expected to agree with this code of conduct. We will enforce this code. We expect cooperation from all members to help ensure a safe environment for everybody. The lab is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, or religion (or lack thereof). We do not tolerate harassment of lab members in any form. Sexual language and imagery are generally not appropriate for any lab venue, including lab meetings, presentations, or discussions. However, do note that we work on biological matters, so work-related discussions of e.g. animal reproduction are appropriate. Harassment includes offensive verbal comments related to gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, religion, sexual images in public spaces, deliberate intimidation, stalking, following, harassing photography or recording, sustained disruption of talks or other events, inappropriate physical contact, and unwelcome sexual attention. Members asked to stop any harassing behavior are expected to comply immediately.

If you are being harassed, notice that someone else is being harassed, or have any other concerns, please contact Casey Greene immediately. If Casey is the cause of your concern, Dr. Deborah Hogan (Deborah.A.Hogan@dartmouth.edu) is a good informal point of contact; she does not work for Casey and has agreed to mediate. For additional resolution paths, please see the University of Colorado Anschutz ombuds or professionalism offices. The code of conduct section is licensed under a Creative Commons Attribution 3.0 Unported License. Original source and credit: http://2012.jsconf.us/#/about & The Ada Initiative. Please help by translating or improving: http://github.com/leftlogic/confcodeofconduct.com.

We expect members to follow these guidelines at any lab-related event.

Authorship: Our lab follows the ICMJE's Uniform Requirements for Manuscripts Submitted to Biomedical Journals definitions of the roles of authors and contributors to our manuscripts.

Ethics: We expect lab members to be honest in scientific communications both within and outside the lab. We expect that lab members will design experiments in a manner that minimizes both bias and self deception. We expect that lab members will keep agreements, be careful, and share their code and results openly with the scientific community. We expect that credit will be given where credit is due, including in scientific writing. Plagiarism is not tolerated. While a full enumeration of ethical considerations is outside of the scope of this document, CU provides a document that we recommend.

In addition, please don't hesitate to raise any questions or concerns that you have at any point with Casey.

PhD Student Committees: PhD students will interact with their qualifier and thesis committees. Students should correspond with the coordinator for their graduate program to understand the expectations that exist around communication with committee members. Questions around what document(s) their committee will expect to see and when they should be sent to the committee should be resolved with the coordinator at least a month in advance of a scheduled meeting. Students in the Greene lab are not to provide food or drinks for committee members. If the students are in a graduate program where a culture of providing food and drinks to committee members has developed, the students can include the information that no food or drink will be provided on an email in advance of the meeting and cite this policy.

Conference Travel: We try to make sure that each member of the lab can travel to one conference per year of their choice outside of their home region. The conference should be within the continental United States or cost competitive with similar conferences in the continental US. Lab members who travel to such a conference should submit an abstract for an oral presentation and poster and should present in whatever form is accepted at the meeting. The conference should be topical for the lab member's research projects and the purpose must align with the grant(s) that support the lab member. Lab members should first clear such travel with Casey. Lab members who are invited to conferences or other presentation opportunities with their costs covered by the organization inviting them, e.g., as an invited speaker or keynote, are welcome to accept such invitations. In all cases conference travel should be noted on the lab attendance calendar.

Communication

General

Slack: We use Slack for rapid communication within the lab. If you plan on sending an e-mail to someone within the lab, try a Slack message instead. This helps to keep communications in one place, and Casey commits to responding to Slack messages, though not necessarily immediately; the same guarantee is not made for e-mail. There are many channels on our lab's Slack; however, it is recommended that newcomers join the following channels: #general, #lab-meeting, #journalclub, #random, #wins.

HeyTaco: We recognize that people regularly go above and beyond lab expectations. We wanted a way to recognize each other when this happens. We now use HeyTaco. This allows lab members to send a quick virtual thank you note and/or pat on the back. If someone’s paper gets accepted or someone helps you out with a programming question, congratulate or thank them. Post a message that mentions any user in the #wins Slack channel, and they'll get a HeyTaco point. When one member accumulates enough points, they take the lab out to lunch (Casey pays).

Social Media: Lab members are encouraged to communicate through public social media. If a lab member chooses to do so on an account that notes an affiliation with the lab, the lab member is expected to follow our code of conduct. Certain employers may require a disclaimer that the lab member's views do not represent those of the employer.

Projects: By the nature of our research, lab members will often have the opportunity to participate in projects managed via private or publicly accessible source code repositories. In these cases, lab members are expected to: follow the code of conduct; expect that private repositories will be world accessible; and communicate via the project-specific medium. For example, when one lab member reports an issue on a project on GitHub, it would not be appropriate for another lab member to reply "I'll drop by your desk and show you how to solve that." It would be most appropriate for the conversation to take place on GitHub issues.

IP/Openness: This is handled in accordance with the instructions from our research sponsors and university guidance. Lab members must follow the Penn Participation Agreement (for Penn-affiliated lab members) or University of Colorado | Anschutz policies (for CU-affiliated lab members) and the agreements with our sponsors. These often allow, encourage, or require openness. If you have concerns at any point, set up a meeting with Casey to discuss these concerns.

Space: Space is assigned in accordance with the Department of Biomedical Informatics Space Policies. For individuals meeting the criteria for assigned space, a lab member needs to fill out the DBMI Space Request Form.

Calendars: There are two Google Calendars for the lab: Greene Lab Core Events (webview, Calendar ID h1eia9g7qu1udm079vsav7qlq0) and Greene Lab Attendance (webview, Calendar ID dk2vdln8ci4mh1m723df6rcb3s). The Attendance calendar is for noting individual availability (i.e. whether you'll be out of office). It should be used, for example, to note vacations, conference travel, and other workday conflicts. All other events should go in the Core Events calendar. In general, this calendar is for events that could possibly involve 3 or more lab members. Mandatory events such as lab meetings, scrums, and group deadlines go on Core Events.

Accounts: Lab members are expected to have accounts for the following and be members of the specified (organizations) if applicable:

  • GitHub (greenelab)
  • Google Calendar (Shared Calendar)
  • Slack (GreeneLab)

Meetings

Scrum: Our team's scrum process involves three components:

1. A weekly kick-off meeting at 10:30 AM ET / 8:30 AM MT Monday morning where individuals will lay out their goals for the week on zoom.
2. A demo day meeting at 3:30 PM ET / 1:30 PM MT Friday afternoon where team members show off an accomplishment from the week in 3 minutes or fewer. This could be a new figure, section of a paper, some code that they are particularly happy with, or something that we learned from a paper, poster, or research presentation.
3. A daily virtual scrum update.

We use Google Slides to share figures or paper sections for the demo day meeting; the link is pinned in the #lab-meeting channel on Slack and is not intended to be shared outside the lab. If you are not already listed in the slides, feel free to add a slide with your name. Alternatively, screen sharing is possible for code demos or interactive weekly accomplishments.

The daily virtual scrum update should include an update to the scrum repository.

An issue is supposed to be automatically created for each day the office is open. These issues can be found here. The update should include the following:

1. What specific item(s) they accomplished yesterday.
2. What specific item(s) they plan to accomplish today.
3. Who, if anyone, is blocking them?
4. Who, if anyone, are they blocking?
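
The four items above map naturally onto a small text template. The following is only an illustrative sketch: the function name `format_scrum_update`, the section titles, and the bullet layout are assumptions made for this example and are not a format the lab prescribes.

```python
def format_scrum_update(accomplished, planned, blocked_by=(), blocking=()):
    """Render a daily scrum update as plain text (hypothetical layout)."""

    def bullets(items):
        # Write "None" explicitly so readers can see the question was considered.
        return "\n".join(f"- {item}" for item in items) if items else "- None"

    sections = [
        ("Accomplished yesterday", accomplished),
        ("Planned for today", planned),
        ("Blocked by", blocked_by),
        ("Blocking", blocking),
    ]
    # One blank line between sections keeps the update easy to scan.
    return "\n\n".join(f"{title}:\n{bullets(items)}" for title, items in sections)
```

The returned string could be pasted as a comment on the day's scrum issue.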

Lab Meeting:

  • Lab meetings are scheduled for one hour on Wednesdays. All members of the Greene lab are expected to attend if possible. Meetings are expected to be a supportive environment for learning, constructive criticism, help, and scientific discussions.

    Meeting Agenda: The format of each meeting will be chosen by the lab meeting lead from the options below. The lead will rotate among lab members (see below) within the group. Guests with aligned research interests may join with a supermajority vote of lab members (>2/3) and are expected to attend and participate fully.

    The lead will choose the format for each meeting. The different options for meeting formats are outlined below. Each member is expected to lead at least two Braintrust meetings per year (one every 6 months).

    • Format 1: Braintrust
      • The meeting lead presents their own research/project to the group. Presenters often focus on open questions or challenges in their work. Occasionally, they present a new talk or set of slides that they intend to deliver at a meeting, job talk, etc. This is a way for the group to get familiar with each other’s work. It is also a good way to get feedback, advice, or help with research if needed.
    • Format 2: Tech Talk/Discussion
      • Talks on commonly used tech in the labs, or strategies for staying on top of the literature, organization, etc.
    • Format 3: Post-Conference Presentations
      • Journal club talk on favorite poster/talk. Either from each person or from a selected set that the group votes on.
    • Format 4: Big ideas or Projects
      • This format is meant to help senior members practice for paper discussion sections/conclusions while helping newer members see where the boundaries of fields are.
    • Format 5: Journal Club
      • Presentation to be given by meeting leader followed by group discussion. The meeting leader should aim to send the chosen paper one week before the scheduled meeting, and lab members are expected to be familiar with the content for discussion.
    • Format 6: Preprint Review
      • Pre-print is discussed by the group. The discussion is led by the meeting leader, and all members are expected to be familiar with the content. The review is written collaboratively, but another member (not the meeting leader) formats, formalizes and uploads to the pre-print server as a comment.

    Lab Meeting Lead: A member of the Greene lab is expected to present/lead each lab meeting as scheduled in the lab meeting calendar (found at https://docs.google.com/document/d/12mcW_1PDqmoli6W-1Mn9o0bCNHOl4Ym0yUzp2NFxBlM/edit).

    • Each member is expected to sign up as lead by the end of the previous academic semester (e.g., sign up by December for the Spring semester, sign up by June for the Fall semester.)
    • Lab meeting sign-ups will happen twice a year. Sign-ups will open in December and June for the upcoming 6 months.
    • Each member will sign up as lead for at least 2 lab meetings each semester (6 months). At least one of these meetings is expected to be a Braintrust meeting.

    Ad-Hoc Meetings: After meetings have been scheduled for the semester, any member can add an ad hoc meeting as needed.

    • Ad hoc meetings are meant to help lab members get advice and help on projects, prepare for talks, oral exams, etc.

Individual Meetings: We schedule weekly individual meetings. Once you join the lab, contact Casey and Michelle to set up a time. These are set up for a term to accommodate class schedules. We don’t reschedule these meetings by default if one of the parties (Casey or you) is out of town, so if you want to meet in a week when travel conflicts with the usual time, contact Casey and Michelle to reschedule. The goal of the weekly meeting is to:

  1. Discuss challenges.
  2. Plan strategy (project related, personal career, etc).

Triannual Self Reflection: Every four months, students, postdocs, and staff will individually meet with Casey to discuss their existing goals and current progress and to set goals for the next interval. To prepare for these meetings, students and staff are required to create an activity report that contains any of the following information (if applicable):

  • publications: submitted/accepted/published
  • grants/fellowships/scholarships: applied/awarded
  • presentations delivered
  • posters presented
  • meeting abstracts: submitted/accepted
  • software releases
  • other honors
  • goals for next session: What would you like to accomplish by the end of next cycle?
  • self-reflection. What do you regard as your strengths and as areas where you need improvement?

The report should be in the form of a plain text file, markdown file, or PDF and the file should be called lastname-reflection-yearmonth (e.g. Greene-reflection-201908.txt). Submit the report in a direct message to Casey via Slack. During the summer, graduate students are required to complete Penn's individual development plan (IDP). Post-docs must complete an IDP prior to their annual contract renewal. This document covers more in-depth content than the regular triannual self reflection; therefore, the IDP can be used as a replacement annual report for that cycle. Because much of the material is overlapping, trainees will benefit from preserving their self reflection materials in a format that supports copying and pasting to the IDP form.
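
As a small illustration of the naming convention, the sketch below builds a report file name from a last name and a date. The helper name `reflection_filename` and the specific extensions `md` and `pdf` are assumptions for this example; the document itself only shows a `.txt` example.

```python
from datetime import date


def reflection_filename(last_name, when, extension="txt"):
    """Build a name like 'Greene-reflection-201908.txt' (hypothetical helper)."""
    # Assumed extensions for plain text, markdown, and PDF reports.
    allowed = {"txt", "md", "pdf"}
    if extension not in allowed:
        raise ValueError(f"extension must be one of {sorted(allowed)}")
    # %Y%m gives a four-digit year followed by a zero-padded month.
    return f"{last_name}-reflection-{when:%Y%m}.{extension}"
```

For example, `reflection_filename("Greene", date(2019, 8, 15))` produces the file name used in the example above.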

Source Code, Data, and Reproducibility

Pride: We expect lab members to sign their code, which means that source code contributions are attributable to an individual's account on GitHub. To quote from The Pragmatic Programmer:

Craftsmen of an earlier age were proud to sign their work. You should be, too… People should see your name on a piece of code and expect it to be solid, well written, tested, and documented.

While some code will be proof-of-concept code, it should be of a form that inspires confidence.

Language: We write code for our analyses in Python or R, which allows everyone in the lab to know two languages and understand analytical code. Code for visualization can be Python, R, or javascript. Webserver interface code uses javascript.

Licensing: We release as many research outputs as possible under permissive open licenses. This ensures lab research is reusable and reproducible, with minimal legal barriers. The default license for software that should be applied to new lab related repositories is the BSD-2-Clause Plus Patent License. This license is OSI-approved and rated highly for its simplicity, compatibility, and effectiveness.

In certain cases, a funding agency requires a different license or upstream restrictions require certain licensing. In these cases, the lab may apply a different license. If you have questions or concerns about licensing, feel free to raise them in Slack.

Version Control Services: Our primary version control service is GitHub, and we have a greenelab account there. We expect that lab members will maintain their code in repositories under this team account. However, lab members should not commit to the branch that is shown as default on GitHub for any of these repositories. Instead, commits happen as described below to facilitate code review.

Creating a Greenelab Repository:

  1. Create a repository under the team account.
  2. Immediately fork this repository into one that your user account owns.
  3. Make commits to your own repository, and move code back to the greenelab repository as described below.
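
For readers new to forks, the sketch below lists git commands that one could run after forking a greenelab repository through the GitHub web interface. It is a hedged illustration, not the lab's prescribed procedure: the function name, the `upstream` remote name, the use of HTTPS URLs, and the example user and repository names are all assumptions.

```python
def fork_workflow_commands(username, repo, branch):
    """Return git commands for working in a personal fork (illustrative only)."""
    fork_url = f"https://github.com/{username}/{repo}.git"
    upstream_url = f"https://github.com/greenelab/{repo}.git"
    return [
        # Clone your own fork, never the team repository, for day-to-day work.
        f"git clone {fork_url}",
        # Track the team repository so you can pull in reviewed changes later.
        f"git -C {repo} remote add upstream {upstream_url}",
        # Work on a topic branch; commits are made here before pushing.
        f"git -C {repo} checkout -b {branch}",
        # Publish the branch to your fork, ready for a pull request.
        f"git -C {repo} push --set-upstream origin {branch}",
    ]
```

Printing the result of `fork_workflow_commands("alice", "example-analysis", "fix-typo")` shows the commands in order; `alice`, `example-analysis`, and `fix-typo` are placeholders.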

Getting Code into Greenelab Repositories: Code moves from user repositories to greenelab repositories through a process of code review. Code review is handled through pull requests. The process is described briefly below. Feel free to ask for guidance if you are uncomfortable with the process. We will revoke write access for failing to adhere to these rules.

  1. Make changes to your code and commit them in your own repository first.
  2. Create a pull request into the repository owned by Greenelab.
  3. Name potential reviewers for your pull request.
  4. Once at least one lab member has approved your pull request, you or a reviewer may merge your pull request. The only exception to this policy is this repository ("onboarding") where, in addition to the above rules, Casey must also approve the pull request.
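
The merge rule in step 4 can be written as a small check. This sketch rests on two assumptions: it reads the "onboarding" exception as requiring Casey's approval in addition to at least one other lab member's, and it uses `casey` as a placeholder handle rather than a real account name.

```python
def can_merge(repo, approvers, casey_handle="casey"):
    """Return True if a pull request has enough approvals to merge (a sketch)."""
    approvers = set(approvers)
    if repo == "onboarding":
        # Assumed reading: Casey plus at least one other lab member must approve.
        others = approvers - {casey_handle}
        return casey_handle in approvers and len(others) >= 1
    # Every other repository needs at least one lab member's approval.
    return len(approvers) >= 1
```

GitHub branch protection settings can enforce similar approval rules automatically; the function above is only meant to make the policy explicit.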

Composition of Pull Requests: Each pull request may contain one or more changesets. In keeping with good source control practice, each changeset or commit should contain all changes necessary for a particular fix or update. In addition, each pull request should relate to no more than one functional area in the code base you are updating. Keeping the pull request focused to one area makes it easier for your reviewers to provide thoughtful feedback.

Reviewing Pull Requests: We expect that all lab members will participate in review of pull requests. If you get named by the submitter, it's courteous to review the request. We have created a checklist to facilitate review. As a reviewer, you are responsible for making sure that all checklist guidelines are followed.

Projects that didn't work: We expect that repositories will contain failures (e.g. proof-of-concepts that didn't work). This is ideal. Being able to find them will make sure we don't make the same failure twice.

Non-Code Versioning: Non-code documents should be kept in a place that maintains version history. Penn provides Box for these purposes. The University of Colorado provides OneDrive for this purpose.

Data Management: For publicly available data, scripts used to download and process these data should be preserved, as should the versions of items used in processing (e.g. probe to gene mappings). These items should be version controlled. Where possible, intermediate files of reasonable size can be stored to facilitate re-use, but the process to regenerate these files from publicly available data should be preserved. When we generate data, they should be stored in a location where they are replicated and uploaded to the relevant database as soon as possible (e.g. GEO for gene expression, SRA for sequencing).
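
One way to preserve the versions of downloaded items is to record a checksum for each file next to the download script. The sketch below is a minimal example under that assumption; the manifest layout and the names `sha256_of` and `write_manifest` are hypothetical and not an established lab convention.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(files, manifest_path, source_urls=None):
    """Write a JSON manifest listing each file's name, checksum, and source."""
    source_urls = source_urls or {}
    entries = [
        {
            "file": Path(f).name,
            "sha256": sha256_of(f),
            # The source URL is optional and stays None when unknown.
            "source": source_urls.get(Path(f).name),
        }
        for f in sorted(files, key=str)
    ]
    Path(manifest_path).write_text(json.dumps(entries, indent=2) + "\n")
    return entries
```

Committing the manifest alongside the download script lets a later run detect when an upstream file, such as a probe to gene mapping, has changed.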

Reproducibility: We expect all lab members to maintain code that performs reproducible analyses. This can be in the form of makefiles, shell scripts, or other automation approaches that allow analyses to be automatically performed. We expect that these scripts, including those to generate figures in papers generated as a consequence of such analyses, will be included in source control repositories (see "Getting Code into Greenelab Repositories") and made publicly available before or concurrent with the submission of preprint (if submitted) or manuscripts. Combined with the review guidelines, this means that all code must have been reviewed for these documents to be submitted.
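
As a minimal sketch of what an automated, repeatable analysis entry point might look like, the script below fixes its random seed and does all of its work in one function, so rerunning it yields the same result. The toy "analysis" (the mean of simulated values) and every name in it are placeholders; a real project would substitute its own steps and would likely be driven by a makefile or shell script as described above.

```python
import random
import statistics


def run_analysis(seed=0, n_samples=100):
    """Run a toy analysis deterministically and return its summary."""
    # A local generator avoids hidden global state that could vary between runs.
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Recording the parameters next to the result documents how it was produced.
    return {"seed": seed, "n_samples": n_samples, "mean": statistics.fmean(samples)}


if __name__ == "__main__":
    print(run_analysis())
```

Calling `run_analysis` twice with the same seed returns identical results, which is the property a reviewer would check before the code is merged.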

How to Modify this Document

This is a living document. The repository is at GitHub. To make changes, fork, edit the files you wish, and create a pull request. The pull request process is handled as described in the Getting Code into Greenelab Repositories section of coding_and_software.

Additional Resources