Template for adding a new model #21

Merged
merged 10 commits into from
Oct 24, 2024

Conversation

kamilest
Collaborator

@kamilest kamilest commented Oct 11, 2024

The purpose of this PR is to track and facilitate discussion regarding the information that needs to be included when submitting a new model to MEDS-DEV.

cc: @teyaberg @aleksiakolo

Summary by CodeRabbit

  • New Features

    • Added a warning in the README indicating that MEDS-DEV currently only supports binary classification tasks.
    • Introduced a README template to assist contributors in documenting models, datasets, tasks, and results.
    • Added a new model documentation template with comprehensive sections for detailed model descriptions.
  • Documentation

    • Updated the evaluation section to clarify binary classification support.
    • Included TODO comments for enhancing documentation in various areas.

Contributor

coderabbitai bot commented Oct 11, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The changes in this pull request primarily involve updates to the README files within the MEDS-DEV repository. A warning has been added to indicate that the framework currently supports only binary classification tasks. Additionally, several TODO comments have been included to signal areas needing further information. A new README template has been introduced to guide contributors in documenting their work, and a model documentation template has been created to standardize the documentation process for models within the framework.

Changes

| File Path | Change Summary |
| --- | --- |
| README.md | Added a warning about binary classification support and several TODO comments for future updates. |
| src/MEDS_DEV/templates/README.md | Introduced a new README template for contributors, with a description of its purpose. |
| src/MEDS_DEV/templates/model.md | Added a new model documentation template, including a warning about binary classification support. |

Possibly related PRs

  • Add a basic example workflow. #16: The changes in the README.md file of this PR include clarifications about the MEDS-DEV project and its functionalities, which are relevant to the updates made in the main PR regarding binary classification tasks and the overall structure of the README.
  • Update Datasets README.md to clarify dataset contribution workflow #26: This PR updates the README.md in the datasets directory to clarify the dataset contribution workflow, which relates to the main PR's focus on enhancing documentation and providing clearer guidance in the README.

Suggested reviewers

  • mmcdermott

🐰 In the garden, I hop and play,
With templates bright to guide the way.
A warning here, a TODO there,
For binary tasks, we must prepare!
So let’s document with joy and cheer,
For every model, we hold dear! 🌼


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (7)
src/MEDS_DEV/templates/model.md (1)

1-1: Enhance visibility of the placeholder text.

The placeholder for the model name is clear, but it could be made more visible to ensure users don't overlook it.

Consider using a different format for the placeholder, such as:

-# [New model] \*\*Add your model name here\*\* 
+# [New model] <ADD YOUR MODEL NAME HERE>

This change makes the placeholder stand out more and follows a common convention for template placeholders.
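
One practical benefit of an angle-bracket placeholder is that it is easy to detect mechanically. The following is a minimal sketch, assuming placeholders are upper-case phrases in angle brackets as in the suggestion above; `find_placeholders` and the regular expression are hypothetical helpers for illustration and are not part of MEDS-DEV.

```python
import re

# Assumed placeholder convention (not defined by MEDS-DEV): an upper-case
# phrase wrapped in angle brackets, e.g. <ADD YOUR MODEL NAME HERE>.
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9 _-]*>")


def find_placeholders(markdown_text: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) pairs for every unfilled placeholder."""
    hits = []
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# A template that still contains its placeholder, and a filled-in copy.
template = "# [New model] <ADD YOUR MODEL NAME HERE>\n\n## Description\n"
filled = "# [New model] My Risk Model\n\n## Description\n"

print(find_placeholders(template))
print(find_placeholders(filled))
```

A check like this could run before a submission is accepted, so that a forgotten placeholder is reported with its line number rather than merged unnoticed.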

README.md (6)

119-120: Good placeholder for additional documentation.

The TODO comment correctly identifies the need for more information about the MEDS-evaluation schema. This is important for users to understand how to properly evaluate their models.

To ensure this task is not overlooked, would you like me to create a GitHub issue to track the addition of information about the MEDS-evaluation schema?


139-141: Consider adding basic structure to the "To Add a Model" section.

While the TODO comment correctly identifies that this section needs to be completed, an empty section provides little guidance to potential contributors. Consider adding a basic structure or some initial guidelines for adding a model, even if the details are filled in later. This could include:

  1. Required information for a new model
  2. Steps to integrate a model into the project
  3. Any specific formatting or documentation requirements

This structure will make it easier for contributors to understand what's expected and for maintainers to fill in the details later.

Would you like me to propose a basic structure for this section?


Line range hint 143-145: Expand the "To Add a Dataset" section with more details.

The current "TODO" placeholder provides insufficient guidance for contributors. To improve this section:

  1. Outline the basic steps for adding a dataset.
  2. Specify any required metadata or documentation for new datasets.
  3. Mention any specific file formats or structures that datasets should adhere to.
  4. Provide information on how to integrate the dataset with existing tasks or models.

Adding this structure will make it easier for contributors to understand the process and requirements for adding new datasets to the project.

Would you like me to draft a more detailed outline for this section?


Line range hint 147-149: Provide more information in the "To Add a Task" section.

The current "TODO" placeholder doesn't offer any guidance for contributors. To improve this section:

  1. Define what constitutes a task in the context of this project.
  2. Outline the process for defining and implementing a new task.
  3. Specify any required configuration files or scripts for tasks.
  4. Explain how tasks interact with datasets and models.
  5. Provide an example of a simple task definition, if possible.

Adding this information will help contributors understand how to effectively add new tasks to the project.

Would you like me to propose a basic structure for this section, including a simple example of a task definition?


Line range hint 151-153: Expand the "To Add Results" section with more details.

The current "TODO" placeholder provides no guidance for contributors. To improve this section:

  1. Specify what kind of results should be added (e.g., model performance metrics, dataset statistics).
  2. Outline the process for submitting results.
  3. Define any required format for result submissions (e.g., specific file formats, naming conventions).
  4. Explain how results will be validated and integrated into the project.
  5. Provide guidelines on how to interpret and compare results.

Adding this information will ensure consistency in result submissions and make it easier for contributors to understand how their results fit into the larger project context.

Would you like me to draft a more detailed outline for this section, including examples of how to format and submit results?


Line range hint 1-153: Overall feedback on README changes

The addition of the warning about binary classification support is a valuable update. However, the multiple TODO placeholders throughout the document, while identifying areas for improvement, don't provide sufficient guidance for contributors.

To enhance the usefulness of this README:

  1. Prioritize filling in the empty sections, particularly "To Add a Model", "To Add a Dataset", "To Add a Task", and "To Add Results".
  2. For each section, provide at least a basic structure or outline of the required information.
  3. Consider adding examples or templates where appropriate to guide contributors.
  4. Ensure that the information about evaluating models is complete and up-to-date.

These improvements will make the README more informative and increase its value as a guide for both users and potential contributors to the MEDS-DEV project.

Would you like assistance in drafting outlines or examples for any of these sections to jumpstart the documentation process?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 30d5631 and b8cc8c9.

📒 Files selected for processing (3)
  • README.md (3 hunks)
  • src/MEDS_DEV/templates/README.md (1 hunks)
  • src/MEDS_DEV/templates/model.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/MEDS_DEV/templates/README.md
🧰 Additional context used
🪛 LanguageTool
src/MEDS_DEV/templates/model.md

[grammar] ~18-~18: The plural noun “instructions” cannot be used with the article “a”. Did you mean “a step-by-step instruction” or “step-by-step instructions”?
Context: .... ## Running the model Please provide a step-by-step instructions for running your model, including insta...

(A_NNS)

🔇 Additional comments (3)
src/MEDS_DEV/templates/model.md (2)

3-5: LGTM: Clear and prominent warning.

The warning about MEDS-DEV only supporting binary classification tasks is well-placed and clearly formatted. It effectively communicates an important limitation to users.


1-34: Overall assessment: Good template structure with room for enhancements.

This new template for model documentation in the MEDS-DEV project provides a solid foundation for consistent and comprehensive model descriptions. The structure covers all essential aspects of model documentation, including description, dependencies, running instructions, resources, and compliance checklist.

The suggested improvements aim to:

  1. Enhance visibility of placeholders
  2. Provide more structured formats for various sections
  3. Correct grammatical issues
  4. Increase clarity and completeness of instructions

Implementing these suggestions will result in a more robust template that guides users to provide thorough and consistent documentation for their models.

README.md (1)

3-5: Excellent addition of the warning notice.

The warning about MEDS-DEV currently only supporting binary classification tasks is clear, concise, and well-placed at the beginning of the README. This information is crucial for users and potential contributors to understand the current limitations of the project.
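
To make the limitation concrete, the sketch below shows one way a contributor might check that a task's labels are binary before submitting. It assumes the labels are available as a plain Python iterable; `is_binary_task` is a hypothetical helper and not a MEDS-DEV function.

```python
from typing import Hashable, Iterable


def is_binary_task(labels: Iterable[Hashable]) -> bool:
    """Return True if the labels are non-empty and contain only 0/1 or False/True."""
    distinct = set(labels)
    # True == 1 and False == 0 in Python, so booleans and 0/1 integers both pass.
    return bool(distinct) and distinct <= {0, 1}


print(is_binary_task([True, False, True]))
print(is_binary_task([0, 1, 2]))
```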

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
.pre-commit-config.yaml (1)

128-128: LGTM! Consider adding comments for ignored error codes.

The consolidation of nbqa-flake8 arguments into a single line improves consistency with other hook configurations in the file. The functionality remains unchanged, which is good.

For improved maintainability, consider adding comments to explain the ignored error codes (E203, E402, E501, F401, F841). This will help future contributors understand why these specific errors are being ignored.

Here's a suggested modification with comments:

- id: nbqa-flake8
  args: [
    "--extend-ignore=E203,E402,E501,F401,F841",  # Ignore whitespace, import, line length, unused import, and unused variable errors
    "--exclude=logs/*,data/*"  # Exclude logs and data directories from linting
  ]
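
For reference, the ignored codes can also be documented in one place and the argument generated from that table. This is only an illustrative sketch: the descriptions paraphrase the pycodestyle (E) and pyflakes (F) messages, and `extend_ignore_arg` is a hypothetical helper, not something the repository defines.

```python
# Meanings of the codes listed in the suggestion above.
IGNORED_CODES = {
    "E203": "whitespace before ':', ',' or ';'",
    "E402": "module-level import not at top of file",
    "E501": "line too long",
    "F401": "module imported but unused",
    "F841": "local variable assigned but never used",
}


def extend_ignore_arg(codes: dict[str, str]) -> str:
    """Build the flake8 --extend-ignore argument from the documented codes."""
    # Dictionaries preserve insertion order, so the codes keep the order above.
    return "--extend-ignore=" + ",".join(codes)


print(extend_ignore_arg(IGNORED_CODES))
for code, meaning in IGNORED_CODES.items():
    print(f"{code}: {meaning}")
```
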
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b8cc8c9 and 2995e29.

📒 Files selected for processing (4)
  • .pre-commit-config.yaml (1 hunks)
  • README.md (1 hunks)
  • src/MEDS_DEV/templates/README.md (1 hunks)
  • src/MEDS_DEV/templates/model.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • README.md
  • src/MEDS_DEV/templates/README.md
🧰 Additional context used
🪛 LanguageTool
src/MEDS_DEV/templates/model.md

[grammar] ~18-~18: The plural noun “instructions” cannot be used with the article “a”. Did you mean “a step-by-step instruction” or “step-by-step instructions”?
Context: .... ## Running the model Please provide a step-by-step instructions for running your model, including insta...

(A_NNS)

🔇 Additional comments (3)
src/MEDS_DEV/templates/model.md (3)

1-5: LGTM: Clear model name placeholder and important warning.

The model name placeholder is clear and instructive. The warning about MEDS-DEV only supporting binary classification tasks is important and prominently displayed using the correct Markdown syntax for GitHub warnings.


7-10: 🛠️ Refactor suggestion

Enhance structure of the description section.

While the current instructions are clear, adding more structure could help ensure consistency across model descriptions and make it easier for users to provide comprehensive information.

Consider updating the section as follows:

 ## Description

-Add a brief description of your model, including the details of its design and its intended purpose
-(what sorts of tasks is it designed for).
+Provide a brief description of your model, including:
+
+- Model architecture and key components
+- Intended purpose and target tasks
+- Any notable features or innovations
+- Performance characteristics (e.g., speed, accuracy)
+- Limitations or constraints

This structure guides users to cover all essential aspects of their model in a consistent format.

Likely invalid or redundant comment.


12-14: 🛠️ Refactor suggestion

Provide a structured format for listing dependencies.

To ensure consistency and completeness in dependency information, consider providing a more structured format for users to follow.

Update the section as follows:

 ## Dependencies

-Add any dependencies that your model may have.
+List all dependencies required to run your model. For each dependency, provide:
+
+- Name of the dependency
+- Version number (if applicable)
+- Any specific installation instructions (if needed)
+
+Example:
+- Python 3.8+
+- TensorFlow 2.4.0
+- NumPy 1.19.5
+- Pandas 1.2.0

This format will help users provide clear and consistent information about their model's dependencies.
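
A dependency list in this format can be generated rather than typed by hand. The following is a minimal sketch using the standard-library `importlib.metadata`; `format_dependencies` is a hypothetical helper, and the package names passed to it are examples only.

```python
import platform
from importlib import metadata
from typing import Callable, Iterable


def format_dependencies(
    packages: Iterable[str],
    get_version: Callable[[str], str] = metadata.version,
) -> list[str]:
    """Return one '- name version' bullet per package, flagging missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"- {name} {get_version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"- {name} (not installed)")
    return lines


if __name__ == "__main__":
    # Example package names; replace with the model's actual dependencies.
    print(f"- Python {platform.python_version()}")
    print("\n".join(format_dependencies(["numpy", "pandas"])))
```

The `get_version` parameter defaults to the real lookup and exists only so the formatting can be exercised without the packages being installed.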

Likely invalid or redundant comment.

@mmcdermott
Owner

@kamilest is this ready for review, or should we convert it to a draft for now? I'm going through outstanding PRs to make sure I'm not blocking anything that needs my review.

kamilest and others added 2 commits October 22, 2024 17:11
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@kamilest
Collaborator Author

> @kamilest is this ready for review, or should we convert it to a draft for now? I'm going through outstanding PRs to make sure I'm not blocking anything that needs my review.

@mmcdermott Given no objections from the others, I will assume this is good enough to merge shortly (after I review CodeRabbit's comments), but if you have time to take a look as well, I would appreciate it!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
src/MEDS_DEV/templates/model.md (2)

1-2: Enhance title format for better readability.

The current title format could be improved to be more user-friendly and consistent with common markdown practices.

-# \[New model\] \*\*Add your model name here\*\*
+# [New Model] Add Your Model Name Here

1-53: Enhance formatting consistency across sections.

Consider applying consistent formatting throughout the document:

  1. Add consistent spacing between sections (one newline before each heading)
  2. Use consistent heading levels for subsections
  3. Ensure consistent bullet point indentation

Would you like me to provide a complete diff showing these formatting improvements?
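
The first item, consistent spacing before headings, is straightforward to automate. Below is a minimal sketch, assuming ATX-style headings (`#` through `######`) and backtick-fenced code blocks; `normalize_heading_spacing` is a hypothetical helper and not part of the repository's tooling.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built indirectly to keep this block readable
HEADING = re.compile(r"#{1,6}\s")


def normalize_heading_spacing(markdown_text: str) -> str:
    """Ensure exactly one blank line precedes every heading except on the first line."""
    out: list[str] = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines starting with '#' inside a code fence are comments, not headings.
        if not in_fence and HEADING.match(line):
            # Drop blank lines already emitted, then add exactly one back.
            while out and out[-1].strip() == "":
                out.pop()
            if out:
                out.append("")
        out.append(line)
    return "\n".join(out) + "\n"


sample = "# Title\ntext\n## A\n\n\n\n## B\n"
print(normalize_heading_spacing(sample))
```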

🧰 Tools
🪛 LanguageTool

[grammar] ~12-~12: The infinitive form should be used here.
Context: ...hitecture and key components - Intended purpose and target tasks - Any notable features...

(MEAN_TO_WRITE)


[grammar] ~40-~40: The plural noun “instructions” cannot be used with the article “a”. Did you mean “a step-by-step instruction” or “step-by-step instructions”?
Context: ...0 ## Running the model Please provide a step-by-step instructions for running your model, including insta...

(A_NNS)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2995e29 and 28bc2c8.

📒 Files selected for processing (1)
  • src/MEDS_DEV/templates/model.md (1 hunks)
🧰 Additional context used
📓 Learnings (1)
src/MEDS_DEV/templates/model.md (1)
Learnt from: Oufattole
PR: mmcdermott/MEDS-DEV#21
File: src/MEDS_DEV/templates/model.md:7-10
Timestamp: 2024-10-11T15:34:38.690Z
Learning: When updating the model documentation template (`src/MEDS_DEV/templates/model.md`), move performance characteristics to a separate section titled *Computational Requirements*, which should include details about performance metrics and the hardware specifications used by the submitter.
🔇 Additional comments (2)
src/MEDS_DEV/templates/model.md (2)

7-23: LGTM! Well-structured sections.

The Description and Computational Requirements sections are well-organized and provide clear guidance for model documentation.


24-37: LGTM! Clear dependency documentation format.

The Dependencies section provides a well-structured format with a helpful example that clearly illustrates how to document dependencies.

Collaborator

@Oufattole Oufattole left a comment

LGTM

@mmcdermott
Owner

LGTM @kamilest

@kamilest kamilest merged commit a64636e into main Oct 24, 2024
5 checks passed
This was referenced Oct 25, 2024
3 participants