feat: implement full TMT analysis pipeline, result visualizations, and Windows compatibility. #26
base: main
Changes from all commits
44415ba
c65d503
c81fee6
fb2fe67
128da6d
5949251
e3c3665
ef8c83a
acbffa9
62de516
b32af7b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| """Abundance (ProteomicsLFQ) Results Page.""" | ||
| import streamlit as st | ||
| import pandas as pd | ||
| import numpy as np | ||
| from pathlib import Path | ||
| from scipy.stats import ttest_ind | ||
| from src.common.common import page_setup | ||
| from statsmodels.stats.multitest import multipletests | ||
| from src.common.results_helpers import get_workflow_dir, get_abundance_data | ||
|
|
||
| params = page_setup() | ||
| st.title("Abundance Quantification") | ||
|
|
||
| st.markdown( | ||
| """ | ||
| View protein and PSM-level quantification from **ProteomicsLFQ**. | ||
| This page calculates differential expression statistics between sample groups. | ||
| """ | ||
| ) | ||
|
|
||
| if "workspace" not in st.session_state: | ||
| st.warning("Please initialize your workspace first.") | ||
| st.stop() | ||
|
|
||
| results_dir = Path(st.session_state["workspace"]) / "topp-workflow" / "results" / "quant_results" | ||
| consensus_out = results_dir / "openms_design_protein_openms.csv" | ||
|
|
||
| @st.cache_data | ||
| def load_data(file_path): | ||
| return pd.read_csv(file_path, sep="\t", comment="#", engine="python") | ||
|
|
||
| if consensus_out.exists(): | ||
|
|
||
| # df = load_data(consensus_out) | ||
| # # ratio column removal | ||
| # df = df.loc[:, ~df.columns.str.contains('ratio', case=False)] | ||
|
|
||
| pre_processing_tab, protein_tab = st.tabs(["Pre-processing", "Protein Table"]) | ||
|
|
||
| with pre_processing_tab: | ||
| # result = get_abundance_data(st.session_state["workspace"]) | ||
| # DEBUG: print detailed cause (temporary) | ||
| try: | ||
| result = get_abundance_data(st.session_state["workspace"]) | ||
| except Exception as e: | ||
| st.exception(e) | ||
| result = None | ||
|
|
||
| if result is None: | ||
| ws = st.session_state.get("workspace") | ||
| st.error("Debug: get_abundance_data returned None") | ||
| st.write("workspace:", ws) | ||
| wf = Path(ws) / "topp-workflow" | ||
| st.write("workflow dir exists:", wf.exists(), "->", wf) | ||
| qdir = wf / "results" / "quant_results" | ||
| st.write("quant_dir exists:", qdir.exists(), "->", qdir) | ||
| if qdir.exists(): | ||
| st.write("csv files:", sorted([p.name for p in qdir.glob('*.csv')])) | ||
| # show cached param snapshot if available | ||
| try: | ||
| from src.workflow.ParameterManager import ParameterManager | ||
| pm = ParameterManager(wf) | ||
| st.write("parameters keys (sample):", list(pm.get_parameters_from_json().items())[:20]) | ||
| except Exception as e: | ||
| st.write("Param manager error:", e) | ||
| st.stop() | ||
|
|
||
| if result is None: | ||
| st.info("💡 Please complete the configuration in the 'Configure' page to see results.") | ||
| st.stop() | ||
|
Comment on lines +43 to +70
Debug/diagnostic code left in production and unreachable code block. Lines 43-66 contain extensive debug output (
♻️ Proposed cleanup
     with pre_processing_tab:
-        # result = get_abundance_data(st.session_state["workspace"])
-        # DEBUG: print detailed cause (temporary)
-        try:
-            result = get_abundance_data(st.session_state["workspace"])
-        except Exception as e:
-            st.exception(e)
-            result = None
-
-        if result is None:
-            ws = st.session_state.get("workspace")
-            st.error("Debug: get_abundance_data returned None")
-            st.write("workspace:", ws)
-            wf = Path(ws) / "topp-workflow"
-            st.write("workflow dir exists:", wf.exists(), "->", wf)
-            qdir = wf / "results" / "quant_results"
-            st.write("quant_dir exists:", qdir.exists(), "->", qdir)
-            if qdir.exists():
-                st.write("csv files:", sorted([p.name for p in qdir.glob('*.csv')]))
-            # show cached param snapshot if available
-            try:
-                from src.workflow.ParameterManager import ParameterManager
-                pm = ParameterManager(wf)
-                st.write("parameters keys (sample):", list(pm.get_parameters_from_json().items())[:20])
-            except Exception as e:
-                st.write("Param manager error:", e)
-            st.stop()
-
+        result = get_abundance_data(st.session_state["workspace"])
+
         if result is None:
             st.info("💡 Please complete the configuration in the 'Configure' page to see results.")
             st.stop()
🧰 Tools
🪛 Ruff (0.15.17)
[warning] 45-45: Do not catch blind exception: `Exception` (BLE001)
[warning] 64-64: Do not catch blind exception: `Exception` (BLE001)
|
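The two BLE001 warnings concern `except Exception`, which also swallows programming errors. If some diagnostic loading has to stay, one option is to catch only the failures that are expected. The following is a minimal sketch under stated assumptions: `load_params_snapshot` is a hypothetical helper (it is not part of the PR or of `ParameterManager`), and the parameters are assumed to live in a plain JSON file.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def load_params_snapshot(params_file):
    """Return the parsed JSON parameters, or None if the file cannot be read.

    Hypothetical helper for illustration only; the file layout is an assumption.
    """
    try:
        with Path(params_file).open(encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        # Only I/O problems and malformed JSON are handled here; any other
        # exception still propagates so real bugs stay visible.
        logger.warning("Could not load parameters from %s: %s", params_file, exc)
        return None
```

Logging the failure instead of writing it to the page keeps the diagnostic available without shipping debug output to users.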
||
|
|
||
| pivot_df, expr_df, group_map = result | ||
|
|
||
| st.write("### Final Results (Group row removed, Stats added)") | ||
| st.dataframe(pivot_df.head(10)) | ||
|
|
||
|
|
||
| with protein_tab: | ||
| st.markdown("### Protein-Level Abundance Table") | ||
|
|
||
| st.info( | ||
| "This protein-level table is generated by grouping all PSMs that map to the " | ||
| "same protein and aggregating their intensities across samples.\n\n" | ||
| "Additionally, log2 fold change and p-values are calculated between sample groups." | ||
| ) | ||
|
|
||
| # Display group comparison info | ||
| groups = sorted(set(group_map.values())) | ||
| if len(groups) >= 2: | ||
| group1, group2 = sorted(groups)[:2] | ||
| st.info(f"Statistical comparison: **{group2} vs {group1}**") | ||
|
|
||
| exclude_cols = ["protein", "log2FC", "p-value", "p-adj", | ||
| "n_proteins", "n_peptides", "protein_score"] | ||
|
|
||
| # Get sample columns (between stats and PeptideSequence) | ||
| sample_cols = [c for c in pivot_df.columns if c | ||
| not in exclude_cols and "ratio" not in c.lower()] | ||
|
|
||
| # Create bar chart column with log2-transformed values | ||
| pivot_df["Intensity"] = pivot_df[sample_cols].apply( | ||
| lambda row: [np.log2(v + 1) for v in row], axis=1 | ||
| ) | ||
|
|
||
| # Reorder columns: place Intensity after p-value | ||
| display_cols = ["protein", "log2FC", "p-value", "Intensity"] + sample_cols | ||
| available_cols = [c for c in display_cols if c in pivot_df.columns] | ||
|
|
||
| st.dataframe( | ||
| pivot_df[available_cols].sort_values("p-value"), | ||
| column_config={ | ||
| "Intensity": st.column_config.BarChartColumn( | ||
| "Intensity", | ||
| help="Sample intensities (log2 scale)", | ||
| width="small", | ||
| y_min=0, | ||
| ), | ||
| }, | ||
| width="stretch" | ||
| ) | ||
| else: | ||
| st.warning(f"File not found: {consensus_out}") | ||
|
|
||
| st.markdown("---") | ||
| st.markdown("**Next steps:** Explore statistical visualizations") | ||
| col1, col2, col3 = st.columns(3) | ||
| with col1: | ||
| st.page_link("content/results_volcano.py", label="Volcano Plot", icon="🌋") | ||
| with col2: | ||
| st.page_link("content/results_pca.py", label="PCA", icon="📊") | ||
| with col3: | ||
| st.page_link("content/results_heatmap.py", label="Heatmap", icon="🔥") | ||
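The page's info text says log2 fold change and p-values are calculated between sample groups, and the file imports `multipletests`, but the helper that does the work (`get_abundance_data`) is not shown in this diff. As a rough, dependency-free sketch of the arithmetic only: `log2_fold_change` and `bh_adjust` are hypothetical names, the pseudocount of 1 is an assumption borrowed from the page's `log2(v + 1)` intensity transform, and Benjamini-Hochberg is assumed as the correction method (the PR may use a different one).

```python
import math
from statistics import fmean


def log2_fold_change(group1, group2, pseudocount=1.0):
    """log2 of the ratio of group means (group2 vs group1).

    The pseudocount keeps the ratio finite when a mean intensity is zero.
    """
    return math.log2((fmean(group2) + pseudocount) / (fmean(group1) + pseudocount))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In the page itself the p-values would presumably come from `ttest_ind` and the adjustment from `multipletests`; the sketch only spells out what those columns mean.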
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| """Database Search (Comet) Results Page.""" | ||
| import streamlit as st | ||
| from pathlib import Path | ||
| from src.common.common import page_setup | ||
| from src.common.results_helpers import get_workflow_dir | ||
| from openms_insight import Table, Heatmap, LinePlot, SequenceView, StateManager | ||
|
|
||
| params = page_setup() | ||
| st.title("Database Search Results") | ||
|
|
||
| st.markdown( | ||
| """ | ||
| View peptide-spectrum matches (PSMs) identified by **Comet** database search. | ||
| Click on a PSM to view the annotated spectrum and peptide sequence. | ||
| """ | ||
| ) | ||
|
|
||
| st.info( | ||
| "**Score:** The e-value (expectation value) represents the expected number of random PSMs " | ||
| "with an equal or better score. Lower values indicate higher confidence identifications." | ||
| ) | ||
|
|
||
| if "workspace" not in st.session_state: | ||
| st.warning("Please initialize your workspace first.") | ||
| st.stop() | ||
|
|
||
| workflow_dir = get_workflow_dir(st.session_state["workspace"]) | ||
| comet_dir = workflow_dir / "results" / "comet_results" | ||
| cache_dir = workflow_dir / "results" / "insight_cache" | ||
|
|
||
| if not comet_dir.exists(): | ||
| st.info("No database search results available yet. Please run the workflow first.") | ||
| st.page_link("content/workflow_run.py", label="Go to Run", icon="🚀") | ||
| st.stop() | ||
|
|
||
| comet_files = sorted(comet_dir.glob("*.idXML")) | ||
|
|
||
| if not comet_files: | ||
| st.warning("No identification output files found.") | ||
| st.stop() | ||
|
|
||
| selected_file = st.selectbox( | ||
| "Select identification result file", | ||
| comet_files, | ||
| format_func=lambda x: x.name | ||
| ) | ||
|
|
||
| cache_id_prefix = selected_file.stem | ||
|
|
||
| # Check if cache exists | ||
| if not (cache_dir / f"table_{cache_id_prefix}").is_dir(): | ||
| st.warning("Visualization cache not found. Please re-run the workflow.") | ||
| st.stop() | ||
|
Comment on lines +51 to +53
Preflight cache validation is incomplete across the OpenMS Insight pages. Each page validates only
📍 Affects 3 files
|
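The page checks for the `table_` cache directory but then also loads `heatmap_`, `seqview_` and `lineplot_` caches. A fuller preflight check could be sketched as below. This is not the reviewer's proposed fix, which is cut off above: `missing_caches` is a hypothetical helper, and it assumes every component cache is a directory named after its `cache_id` under `cache_dir`, which the diff only confirms for the table.

```python
from pathlib import Path

# Component prefixes used by the database search page in this diff.
COMPONENTS = ("table", "heatmap", "seqview", "lineplot")


def missing_caches(cache_dir, cache_id_prefix, components=COMPONENTS):
    """Return the cache ids whose directories are absent under cache_dir."""
    cache_dir = Path(cache_dir)
    return [
        f"{name}_{cache_id_prefix}"
        for name in components
        if not (cache_dir / f"{name}_{cache_id_prefix}").is_dir()
    ]
```

A page could call this once before constructing any component and show the warning and `st.stop()` if the returned list is non-empty; pages with a different set of components would pass their own `components` tuple.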
||
|
|
||
| # Initialize state manager for cross-component linking | ||
| state_manager = StateManager() | ||
|
|
||
| # Load components from cache (no data parameter needed) | ||
| table = Table(cache_id=f"table_{cache_id_prefix}", cache_path=str(cache_dir)) | ||
| heatmap = Heatmap(cache_id=f"heatmap_{cache_id_prefix}", cache_path=str(cache_dir)) | ||
| seq_view = SequenceView(cache_id=f"seqview_{cache_id_prefix}", cache_path=str(cache_dir)) | ||
| line_plot = LinePlot(cache_id=f"lineplot_{cache_id_prefix}", cache_path=str(cache_dir)) | ||
|
|
||
| # Render components | ||
| st.subheader("PSM Overview") | ||
| heatmap(state_manager=state_manager, height=350) | ||
|
|
||
| st.subheader("PSM Table") | ||
| table(state_manager=state_manager, height=533) | ||
|
|
||
| st.subheader("Peptide Sequence") | ||
| seq_view(key=f"seqview_{cache_id_prefix}", state_manager=state_manager, height=533) | ||
|
|
||
| st.subheader("MS2 Spectrum") | ||
| line_plot(key=f"lineplot_{cache_id_prefix}", state_manager=state_manager, height=450, sequence_view_key=f"seqview_{cache_id_prefix}") | ||
|
|
||
| st.markdown("---") | ||
| st.markdown("**Next step:** View rescoring results") | ||
| st.page_link("content/results_rescoring.py", label="Go to Rescoring", icon="📈") | ||
🧩 Analysis chain
🏁 Script executed:
Repository: OpenMS/quantms-web
Length of output: 1951
🌐 Web query:
actions/setup-python@v4 deprecated end of life status 2024 2025 2026
💡 Result:
As of June 16, 2026, the actions/setup-python@v4 action is not officially marked as deprecated, but it is considered a legacy major version [1][2]. The official recommendation is to use the latest major version, which is v6 (as of the latest release v6.2.0 in January 2026) [1][3]. While v4 may continue to function in many workflows, it relies on older underlying technology (such as older Node.js runtimes) compared to current releases [1][4][3].

Key points regarding the status of setup-python:
- Versioning and Support: GitHub Actions maintainers follow a practice of releasing new major versions to accommodate breaking changes, such as upgrades to the underlying Node.js runtime [1][3]. v4 was superseded by v5 (introduced around late 2023) and subsequently v6 [5][3].
- Deprecation Policy: GitHub typically manages the lifecycle of its actions by phasing out older major versions as newer ones become the standard [6]. Users are encouraged to migrate to the latest major version to ensure compatibility with updated GitHub runner environments, which may eventually drop support for the Node.js versions required by legacy actions [1][4][3].
- Usage: If your workflows are currently using `@v4`, they should be updated to `@v6` to leverage the latest features, security patches, and compatibility improvements [1][3].

There is no specific EOL (End of Life) date publicly posted for v4 in the same manner as OS images (like Ubuntu 20.04) [7], but it is effectively obsolete for new or maintained projects [5][3].

Citations:
- [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead. actions/setup-python#1208

🌐 Web query:
GitHub Actions best practices pin commit SHA major version tags security
💡 Result:
The most effective security best practice for GitHub Actions is to pin every action to a full-length commit SHA [1][2][3]. Unlike version tags or branch names, which are mutable labels that can be redirected to malicious code by an attacker with repository access, a full-length commit SHA is cryptographically immutable and guarantees that the exact code you intend to run is executed every time [1][4][5][6].

Security Implications
Using version tags (e.g., `@v4`) or branches (e.g., `@main`) creates a continuous, implicit trust relationship with the upstream repository [7]. If an attacker compromises the maintainer's account or gains write access to the repository, they can silently force-push updates to existing tags [1][8][6]. Downstream consumers relying on those tags will unknowingly execute the malicious code on their next workflow run, even if their own workflow files remain unchanged [4][7][9]. This has been a recurring vector for supply chain attacks in the GitHub Actions ecosystem [10][8][7].

Implementation and Maintenance Best Practices
To manage the security and maintenance burden of SHA pinning:
1. Pin to SHA with a Human-Readable Comment: To maintain clarity, always include the version tag in a comment after the SHA [10]. This allows developers to understand which version they are using while gaining the security of immutability [4][10]. Example: uses: actions/checkout@3df4ab1 # v4.0.0
2. Automate Dependency Updates: Do not manually manage SHAs for all actions. Use dependency management tools like Dependabot or Renovate to automatically create pull requests that update your pinned SHAs [11][1][10]. For example, Renovate supports specific presets (e.g., helpers:pinGitHubActionDigests) that keep these hashes current [11][10].
3. Enforce via Organization Policy: GitHub provides organization and repository-level policies that can block the use of actions not pinned to a full-length commit SHA [12][2][5]. Enabling this policy is a highly recommended security hardening step, as it forces all workflows to comply with immutable referencing [12][10].
4. Audit the Dependency Tree: Remember that an action you use may rely on other actions internally. If you are an action maintainer, you should also pin your internal dependencies to full commit SHAs to ensure your consumers remain secure and compliant with organization-level policies [5].

By treating version tags only as discovery labels rather than trust anchors, you significantly reduce the risk of supply chain compromise in your CI/CD pipelines [3].
Update to latest version and pin GitHub Actions to commit SHAs.
Lines 14–15 reference `actions/checkout@v4` and `actions/setup-python@v4`. The setup-python major version is outdated: v4 is superseded by v6 (current as of 2026), while the suggested pin for checkout stays on v4. Additionally, both are unpinned version tags, which introduces supply-chain risk; version tags are mutable and can be redirected to malicious code. Use the latest major versions pinned to full commit SHAs with a version comment for clarity, e.g., `actions/checkout@<commit-sha> # v4.x.x` and `actions/setup-python@<commit-sha> # v6.x.x`.
🧰 Tools
🪛 actionlint (1.7.12)
[error] 15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 zizmor (1.25.2)
[warning] 14-14: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 14-14: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 15-15: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
Source: Linters/SAST tools
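To make the `unpinned-uses` finding concrete, here is a small sketch of how such a check can work. It is an illustration only, not how zizmor or actionlint implement it: `unpinned_actions` is a hypothetical function, it scans the workflow text with a regular expression instead of parsing YAML, and it does not handle quoted `uses:` values.

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
# A full-length Git commit SHA is 40 lowercase hexadecimal characters.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text):
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        # Local actions and Docker images are not pinned by Git SHA.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not SHA_RE.fullmatch(version):
            unpinned.append(ref)
    return unpinned
```

Run against the workflow in this PR, a check like this would be expected to report both `actions/checkout@v4` and `actions/setup-python@v4`, matching the two `unpinned-uses` errors above.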