bgcflow integration with panalleleome #338

Open
wants to merge 1 commit into
base: main
from

Conversation

anpanche

No description provided.

@matinnuhamunada
Collaborator

matinnuhamunada commented Mar 26, 2024

Hi @anpanche, thanks for the PR. I have some questions before starting the integration process. What are the differences between these two repositories?

@anpanche
Author

The core alleleome measures sequence diversity only for the core genes of the pangenome, that is, the genes present in >90% of strains.
The qcqa step in the alleleome analysis checks that each gene is present in at least 5% of strains and removes the genes that do not satisfy this condition.
Since core genes are by definition present in >90% of strains, this step is not necessary for the core alleleome.
For pangenes (core, accessory, and rare), this step is essential because it determines which gene sets are carried forward into the alleleome. The pan alleleome therefore introduces this step.
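
To make the thresholds above concrete, here is a minimal sketch of a presence filter. It is not taken from either repository: the function names, the `gene_to_strains` mapping, and the exact comparison operators (`>=` for the 5% qcqa cutoff, `>` for the 90% core cutoff) are assumptions based only on the wording in this comment.

```python
from typing import Dict, List, Set


def presence_fraction(strains_with_gene: Set[str], n_strains: int) -> float:
    """Fraction of strains in which a gene is present."""
    if n_strains <= 0:
        raise ValueError("n_strains must be positive")
    return len(strains_with_gene) / n_strains


def qcqa_filter(
    gene_to_strains: Dict[str, Set[str]],
    n_strains: int,
    min_fraction: float = 0.05,  # assumed: "at least 5% of strains"
) -> List[str]:
    """Keep genes present in at least min_fraction of strains (hypothetical helper)."""
    return sorted(
        gene
        for gene, strains in gene_to_strains.items()
        if presence_fraction(strains, n_strains) >= min_fraction
    )


def core_genes(
    gene_to_strains: Dict[str, Set[str]],
    n_strains: int,
    core_fraction: float = 0.90,  # assumed: "present in >90% of strains"
) -> List[str]:
    """Genes present in more than core_fraction of strains (hypothetical helper)."""
    return sorted(
        gene
        for gene, strains in gene_to_strains.items()
        if presence_fraction(strains, n_strains) > core_fraction
    )


# Toy data: 40 strains named s0..s39 (illustrative only, not real results).
strain_ids = [f"s{i}" for i in range(40)]
toy = {
    "geneA": set(strain_ids[:38]),  # 95% of strains: core
    "geneB": set(strain_ids[:4]),   # 10% of strains: kept by qcqa, not core
    "geneC": set(strain_ids[:1]),   # 2.5% of strains: removed by qcqa
}
kept = qcqa_filter(toy, n_strains=40)
core = core_genes(toy, n_strains=40)
```

In this toy case every core gene also passes the qcqa filter, which illustrates why the step is redundant for the core alleleome but changes the gene set for the pan alleleome.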

@matinnuhamunada
Collaborator

Can't these two repositories be merged, with a CLI option added to choose whether to run the Core or the Pan pipeline?
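
As a rough illustration of that idea, a single entry point could expose a mode switch along the following lines. This is only a sketch: the program name `alleleome`, the `--mode` and `--min-presence` flags, and the default values are hypothetical and do not exist in either repository.

```python
import argparse
from typing import Dict, List, Optional, Union


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical core/pan mode switch."""
    parser = argparse.ArgumentParser(
        prog="alleleome",  # hypothetical program name
        description="Run the core or the pan alleleome pipeline.",
    )
    parser.add_argument(
        "--mode",
        choices=["core", "pan"],
        default="core",  # assumed default, not confirmed in this thread
        help="which gene set to analyze",
    )
    parser.add_argument(
        "--min-presence",
        type=float,
        default=0.05,  # assumed from the 5% qcqa threshold described above
        help="minimum fraction of strains a gene must be present in (pan mode)",
    )
    return parser


def plan_run(argv: Optional[List[str]] = None) -> Dict[str, Union[str, bool, float]]:
    """Parse arguments and decide whether the qcqa presence filter is needed."""
    args = build_parser().parse_args(argv)
    return {
        "mode": args.mode,
        # Only the pan pipeline needs the qcqa presence filter.
        "run_qcqa": args.mode == "pan",
        "min_presence": args.min_presence,
    }


# Example: plan_run(["--mode", "pan"]) enables the qcqa step.
```

With a switch like this, the shared steps would live in one code base and only the qcqa filter would be conditional on the selected mode.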

@anpanche
Author

Yes, this can also be done.

@anpanche
Author

Hi @matinnuhamunada,
which kind of integration would be more convenient, so that I can work accordingly? If we decide to integrate corealleleome with panalleleome, how would it be used in bgcflow? What changes do I need to make, and what additional scripts need to be written?
