Add count_filter_fcds() that does counting and filtering together #84

Open
gadenbuie opened this issue Oct 2, 2019 · 1 comment
Labels
planning 🌱 Development planning

gadenbuie commented Oct 2, 2019

It's cumbersome to remember (and repeat) the grouping conditions required to complete zero-count groups when using count_fcds(). The complete_age_groups() function is correctly named, but it may be overly specific for the task at hand.

fcds %>% 
  filter(cancer_site_group == "Cervix Uteri") %>% 
  filter(sex == "Female") %>% 
  filter_age_groups(age_gt = 20) %>% 
  filter(county_name %in% fcds_const("moffitt_catchment")) %>% 
  count_fcds(race = TRUE, county_name, cancer_site_group) %>% 
  complete_age_groups(
    # required to know which age groups need to be completed
    age_gt = 20, 
    # Need to know the structure of the columns that need to be completed
    sex, race, county_name, cancer_site_group, 
    # Here's the tricky part: year_group and year vary together
    nesting(year_group, year)
  )

This workflow is very flexible and ensures that any request can be completed. But the pattern is also fairly common and can be abstracted into a single, one-shot function, filter_count_fcds().

fcds %>% 
  filter_count_fcds(
    # Filters ....
    cancer_site_group == "Cervix Uteri",
    sex == "Female",
    county_name %in% fcds_const("moffitt_catchment"),
    # Arguments ....
    age_gt = 20,
    groups = c(race)
  )
  1. Columns referenced in the filters are automatically included in groups.
  2. The groups argument breaks the counts down by additional columns, across all values in each of those columns.
  3. c("year_group", "year", "age_group") are still the default groups.
  4. When using just the FCDS data, the default column structure can be inferred from the columns present in filters and groups. If unknown columns are found, the function can bail early and recommend the manual workflow.
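
A minimal sketch of how points 1, 3, and 4 might fit together, using only base R. The helper name infer_groups() and its arguments are hypothetical and not part of the fcds package; the list of known columns is an assumption taken from the columns used in the example above.

```r
# Hypothetical helper (not part of fcds): derive the grouping columns from
# the filter expressions plus any extra groups, and bail on unknown columns.
infer_groups <- function(filters, groups = character(), known_cols,
                         default_groups = c("year_group", "year", "age_group")) {
  # all.vars() returns the variable names used in an expression;
  # function names such as fcds_const are not included
  filter_cols <- unique(unlist(lapply(filters, all.vars)))

  unknown <- setdiff(c(filter_cols, groups), known_cols)
  if (length(unknown) > 0) {
    stop(
      "Unknown columns: ", paste(unknown, collapse = ", "),
      ". Consider the manual count_fcds() + complete_age_groups() workflow.",
      call. = FALSE
    )
  }

  # Defaults first, then filter columns, then the extra groups
  unique(c(default_groups, filter_cols, groups))
}

# Filters written as unevaluated expressions
filters <- list(
  quote(cancer_site_group == "Cervix Uteri"),
  quote(sex == "Female")
)

# Assumed set of known columns, for illustration only
known <- c("year_group", "year", "age_group", "sex", "race",
           "county_name", "cancer_site_group")

group_cols <- infer_groups(filters, groups = "race", known_cols = known)
```

One caveat of this approach: a filter that refers to an object outside the data (for example a vector of site names stored in a variable) would be reported as an unknown column, so a real implementation would need to treat that case separately.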
@gadenbuie gadenbuie self-assigned this Oct 2, 2019
@gadenbuie

The filters need to be in ... so that I can capture them with rlang::enexprs(), but we can use a helper function for the age bounds that mixes filter_age_groups() and recode_age_groups().
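
A rough sketch of the capture step described above, assuming dplyr and rlang are installed. The function filter_count_sketch() and the toy data are hypothetical; the sketch only filters and counts, and leaves out both the age-bounds helper and the completion of zero-count groups.

```r
library(dplyr)

# Hypothetical sketch; name and arguments are assumptions, not the fcds API.
# Because `groups` comes after `...`, it has to be passed by name.
filter_count_sketch <- function(.data, ..., groups = character()) {
  # Capture the unevaluated filter expressions passed in `...`
  filters <- rlang::enexprs(...)

  # Columns referenced by the filters become grouping columns as well
  filter_cols <- unique(unlist(lapply(filters, all.vars)))
  group_cols <- unique(c(filter_cols, groups))

  .data %>%
    filter(!!!filters) %>%               # splice the captured filters
    count(!!!rlang::syms(group_cols))    # count by the inferred groups
}

# Toy data for illustration only
toy <- tibble(
  sex  = c("Female", "Female", "Male", "Female"),
  race = c("White", "Black", "White", "White"),
  site = c("A", "A", "A", "B")
)

res <- filter_count_sketch(toy, sex == "Female", site == "A", groups = "race")
```

Note that rlang::enexprs() captures bare expressions without their environment. If the filters may refer to objects defined in the caller's environment, rlang::enquos() captures quosures that keep that environment.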

@gadenbuie gadenbuie added the planning 🌱 Development planning label Jul 28, 2020