Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generating results for a sequential modeling pipeline (to compare to our current pipeline) #132

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

ejiawustl
Copy link
Collaborator

modifying main.py to generate results for the sequential pipeline modeling (to compare to the current pipeline) for bootstrap_lassocv method. Note that a slight modification was made to the source code to utilize an additional keyword argument in generate_modeling_data, but this essentially doesn't change the source code at all

…eling (to compare to the current pipeline) for bootstrap_lassocv method. Note that a slight modification was made to the source code to utilize an additional keyword argument in generate_modeling_data, but this essentially doesn't change the source code at all
@ejiawustl ejiawustl requested a review from cmatKhan December 2, 2024 18:13
# Rest of the code remains the same...
# Store the results under a new directory
sequential_output_dir = os.path.join(
output_dirpath, "sequential_top_genes_results"
Copy link
Member

@cmatKhan cmatKhan Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in your results_20241202, I see this pattern:

results_20241202/ACE2/ACE2/

Where there is a directory with the same name nested under the other. As far as I know, this wasn't happening before. It isn't immediately clear to me that it is in this line that this occurs, but nowhere else stands out.

On a related note: somewhere in the docs, in the cmd line --help, or even as a README that gets deposited in every results otuput directory, there needs to be a definition of what "sequential_top_genes_results" are.

I agree with you that keeping this in the main workflow for now (meaning, not creating another cmd line option to turn this feature on and off) is a good idea. It just needs to be documented so that it is clear what these results are.

Copy link
Member

@cmatKhan cmatKhan Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than this, it looks good to me in terms of format. I haven't pulled and run this, and __main__ is where my testing is not strict. My strategy is to implement the logic elsewhere, and do only very simple things that I am quite certain of in __main__ (not a strong strategy). Please do make sure that you go over your added code carefully. Play with it in a console/notebook if you need to be sure of something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants