Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit primer pairs in penalty order. #87

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

tfenne
Copy link
Member

@tfenne tfenne commented Nov 14, 2024

This PR does two things:

  1. Fix a bug where amplicons that were too small were still emitted
  2. Ensure that primer pairs are emitted in penalty order

(2) requires materializing a small tuple for all valid pairs, and then sorting by score. The tuple contains two ints (indices into the primer sequences) and two floats (the penalty and the tm, the last for convenience so we don't have to recompute it). It the sorts the tuples by penalty, and starts generating PrimerPairs in penalty order.

If you have e.g. 500 left and 500 right primers, this could construct ~250k tuples and calculate 250k Tms, but in reality the number is probably substantially smaller constrained by amplicon sizes.

@tfenne tfenne requested a review from msto November 14, 2024 03:27
@tfenne tfenne requested a review from nh13 as a code owner November 14, 2024 03:27
Copy link

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.66%. Comparing base (33dfeb9) to head (a7cef00).

Additional details and impacted files
@@                          Coverage Diff                          @@
##           tf_build_primers_list_to_sequence      #87      +/-   ##
=====================================================================
+ Coverage                              96.64%   96.66%   +0.02%     
=====================================================================
  Files                                     26       26              
  Lines                                   1697     1710      +13     
  Branches                                 330      334       +4     
=====================================================================
+ Hits                                    1640     1653      +13     
  Misses                                    31       31              
  Partials                                  26       26              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@msto msto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tfenne

Thanks for opening this, I think this will be a nice improvement.

I think this PR should likely be broken into ~3 separate PRs to facilitate the introduction of this feature.

  1. Fix the bug where amplicons that are too small are still emitted.
  2. Refactor build_primer_pairs() to extract the logic associated with the construction of PrimerPair objects into a classmethod on that class (closing Extract the body of the main for loop in build_primer_pairs() into an alternative constructor on PrimerPair #85).
  3. Ensure that primer pairs are emitted in penalty order.

Separating out this bug fix into a separate PR will make it easier to review that fix in isolation. Similarly, reducing the complexity of build_primer_pairs() before adding functionality to it will make it easier to review and test those changes. In particular, I think we lack coverage over any cases where primer pairs are actually built.

@@ -136,42 +139,60 @@ def build_primer_pairs(
region_end = max(p.span.end for p in right_primers)
bases = fasta.fetch(target.refname, region_start - 1, region_end)

# Each tuple is left_idx, right_idx, penalty, tm
pairings: list[Tuple[int, int, float, float]] = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion Could we make this a dataclass? A four element tuple with multiple elements of the same type seems destined to be mis-indexed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion I'd really like to get some test coverage over cases where pairs are actually generated, not just cases where we expect an error.


# Put it all together
return left_primer.penalty + right_primer.penalty + size_penalty + tm_penalty


def build_primer_pairs(
def build_primer_pairs( # noqa: C901
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion I think this is a good sign that now would be an appropriate time to consider implementing #85

Base automatically changed from tf_build_primers_list_to_sequence to main November 14, 2024 23:09
@fulcrumgenomics fulcrumgenomics deleted a comment from msto Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants