-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit primer pairs in penalty order. #87
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## tf_build_primers_list_to_sequence #87 +/- ##
=====================================================================
+ Coverage 96.64% 96.66% +0.02%
=====================================================================
Files 26 26
Lines 1697 1710 +13
Branches 330 334 +4
=====================================================================
+ Hits 1640 1653 +13
Misses 31 31
Partials 26 26 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @tfenne
Thanks for opening this, I think this will be a nice improvement.
I think this PR should likely be broken into ~3 separate PRs to facilitate the introduction of this feature.
- Fix the bug where amplicons that are too small are still emitted.
- Refactor
build_primer_pairs()
to extract the logic associated with the construction ofPrimerPair
objects into aclassmethod
on that class (closing Extract the body of the main for loop inbuild_primer_pairs()
into an alternative constructor onPrimerPair
#85). - Ensure that primer pairs are emitted in penalty order.
Separating out this bug fix into a separate PR will make it easier to review that fix in isolation. Similarly, reducing the complexity of build_primer_pairs()
before adding functionality to it will make it easier to review and test those changes. In particular, I think we lack coverage over any cases where primer pairs are actually built.
@@ -136,42 +139,60 @@ def build_primer_pairs( | |||
region_end = max(p.span.end for p in right_primers) | |||
bases = fasta.fetch(target.refname, region_start - 1, region_end) | |||
|
|||
# Each tuple is left_idx, right_idx, penalty, tm | |||
pairings: list[Tuple[int, int, float, float]] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion Could we make this a dataclass? A four element tuple with multiple elements of the same type seems destined to be mis-indexed.
tests/api/test_picking.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion I'd really like to get some test coverage over cases where pairs are actually generated, not just cases where we expect an error.
|
||
# Put it all together | ||
return left_primer.penalty + right_primer.penalty + size_penalty + tm_penalty | ||
|
||
|
||
def build_primer_pairs( | ||
def build_primer_pairs( # noqa: C901 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion I think this is a good sign that now would be an appropriate time to consider implementing #85
This PR does two things:
(2) requires materializing a small tuple for all valid pairs, and then sorting by score. The tuple contains two ints (indices into the primer sequences) and two floats (the penalty and the tm, the last for convenience so we don't have to recompute it). It the sorts the tuples by penalty, and starts generating PrimerPairs in penalty order.
If you have e.g. 500 left and 500 right primers, this could construct ~250k tuples and calculate 250k Tms, but in reality the number is probably substantially smaller constrained by amplicon sizes.