Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(vignette): Add ccrcc dataset and ID conversion to vignettes #35

Merged
merged 3 commits into from
Jan 24, 2025

Conversation

tonywu1999
Copy link
Contributor

@tonywu1999 tonywu1999 commented Jan 24, 2025

PR Type

Documentation, Enhancement


Description

  • Updated vignette to include CCRCC dataset and ID conversion.

  • Enhanced annotateProteinInfoFromIndra function for better ID handling.

  • Improved examples for ID conversion and subnetwork queries.

  • Added references to external datasets and tools.


Changes walkthrough 📝

Relevant files
Enhancement
annotateProteinInfoFromIndra.R
Improved handling of `Protein` ID data type                           

R/annotateProteinInfoFromIndra.R

  • Modified .populateUniprotIdsInDataFrame to cast Protein to character.
  • Ensures compatibility with different data types for Protein.
  • +1/-1     
    Documentation
    MSstatsBioNet.Rmd
    Updated vignette with CCRCC dataset and ID conversion       

    vignettes/MSstatsBioNet.Rmd

  • Added CCRCC dataset example with reference to external paper.
  • Updated ID conversion example using annotateProteinInfoFromIndra.
  • Enhanced subnetwork query example with additional outputs.
  • Replaced outdated examples with streamlined workflows.
  • +23/-31 

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Possible Issue

    The added line df$UniprotId <- as.character(df$Protein) in the if (proteinIdType == "Uniprot") block may introduce unintended behavior if df$Protein contains non-character data types. Ensure that this conversion is necessary and does not lead to data corruption or loss.

    df$UniprotId <- as.character(df$Protein)
    Dataset Reference Clarity

    The new dataset reference in the vignette points to a specific paper and file path. Ensure that the dataset is accessible and the reference is clear for users to replicate the example without issues.

    We will be taking a subset of the dataset found in this [paper](https://pmc.ncbi.nlm.nih.gov/articles/PMC7331093/).  
    
    ```{r}
    input = data.table::fread(system.file(
        "extdata/msstats.csv",
        package = "MSstatsBioNet"
    ))
    
    </details>
    
    <details><summary><a href='https://github.com/Vitek-Lab/MSstatsBioNet/pull/35/files#diff-ea524a7e9f2f63335edd3c785c469e00078c9aacf6bfd5c2ff95da29a998a607R109-R111'><strong>Subnetwork Query Example</strong></a>
    
    The updated example for `getSubnetworkFromIndra` introduces a `pvalueCutoff` parameter. Verify that this parameter is well-documented and its behavior is clear to users.</summary>
    
    ```txt
    subnetwork <- getSubnetworkFromIndra(annotated_df, pvalueCutoff = 0.05)
    head(subnetwork$nodes)
    head(subnetwork$edges)
    

    Copy link

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    General
    Add error handling for file reading

    Add error handling for the data.table::fread function to ensure that the file path
    is valid and the file exists, preventing runtime errors.

    vignettes/MSstatsBioNet.Rmd [45-48]

    -input = data.table::fread(system.file(
    -    "extdata/msstats.csv",
    -    package = "MSstatsBioNet"
    -))
    +input = tryCatch({
    +    data.table::fread(system.file(
    +        "extdata/msstats.csv",
    +        package = "MSstatsBioNet"
    +    ))
    +}, error = function(e) {
    +    stop("Error reading input file: ", e$message)
    +})
    Suggestion importance[1-10]: 8

    Why: Adding error handling for file reading is a significant improvement, as it prevents runtime errors and provides meaningful feedback to the user in case of issues. This enhances the robustness of the code.

    8
    Validate parameter range for robustness

    Add a check to ensure that the pvalueCutoff parameter in getSubnetworkFromIndra is
    within a valid range (e.g., between 0 and 1) to avoid unexpected behavior.

    vignettes/MSstatsBioNet.Rmd [109]

    +if (pvalueCutoff < 0 || pvalueCutoff > 1) {
    +    stop("pvalueCutoff must be between 0 and 1.")
    +}
     subnetwork <- getSubnetworkFromIndra(annotated_df, pvalueCutoff = 0.05)
    Suggestion importance[1-10]: 6

    Why: Adding a check for the pvalueCutoff parameter range improves the robustness of the code by ensuring valid input. However, the suggestion does not account for how pvalueCutoff is defined or passed in the current context, slightly reducing its impact.

    6
    Possible issue
    Validate input structure for function

    Validate the structure of model$ComparisonResult before passing it to
    annotateProteinInfoFromIndra to ensure it contains the required columns and data.

    vignettes/MSstatsBioNet.Rmd [98]

    +if (!all(c("Protein", "OtherRequiredColumn") %in% colnames(model$ComparisonResult))) {
    +    stop("model$ComparisonResult does not have the required structure.")
    +}
     annotated_df = annotateProteinInfoFromIndra(model$ComparisonResult, "Uniprot")
    Suggestion importance[1-10]: 7

    Why: Validating the structure of model$ComparisonResult ensures that the function receives the expected input, preventing potential runtime errors. This is a useful addition for improving code reliability.

    7

    @codecov-commenter
    Copy link

    Codecov Report

    All modified and coverable lines are covered by tests ✅

    Project coverage is 82.79%. Comparing base (fc732dd) to head (afe1acd).

    Additional details and impacted files
    @@           Coverage Diff           @@
    ##            devel      #35   +/-   ##
    =======================================
      Coverage   82.79%   82.79%           
    =======================================
      Files           7        7           
      Lines         465      465           
    =======================================
      Hits          385      385           
      Misses         80       80           

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    @tonywu1999
    Copy link
    Contributor Author

    #25

    @tonywu1999 tonywu1999 merged commit bf4235e into devel Jan 24, 2025
    4 checks passed
    @tonywu1999 tonywu1999 deleted the ccrcc branch January 24, 2025 14:28
    @tonywu1999
    Copy link
    Contributor Author

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants