Version: 1.0.0
Commit Hash: 4e9c80c94e6dadf3cc7dec83a77540de66c9ef81
Author: Toan Phung
Category: analysis
Subcategory: protein-conservation
Protein sequence conservation analysis using ConSurf standalone. Identifies conserved and variable regions in protein sequences based on homology search and multiple sequence alignment. Based on: Ashkenazy et al. (2016) Nucleic Acids Res 44(W1):W344-W350. Upstream ConSurf repository: https://github.com/Rostlab/ConSurf
ConSurf Conservation Analysis
Installation
Repository: https://github.com/noatgnu/consurf-standalone-plugin
Manual installation:
- Open Cauldron
- Go to Plugins → Install from Repository
- Paste: https://github.com/noatgnu/consurf-standalone-plugin
- Click Install
ID: consurf-conservation
Runtime
- Environments: docker
- Entrypoint: /opt/miniconda/bin/conda run -n consurf_env --no-capture-output python /workspace/stand_alone_consurf/stand_alone_consurf.py
Inputs
| Name | Label | Type | Required | Default | Visibility |
|---|---|---|---|---|---|
| query_sequence | Query Protein Sequence | file | Yes | - | Always visible |
| fasta_database | Protein FASTA Database | file | Yes | - | Always visible |
| algorithm | Homolog Search Algorithm | select (HMMER, BLAST) | Yes | HMMER | Always visible |
| max_homologs | Maximum Homologs | number (min: 10, max: 500, step: 10) | Yes | 150 | Always visible |
| substitution_model | Substitution Model | select (BEST, JTT, LG, WAG, cpREV, mtREV, Dayhoff) | Yes | BEST | Always visible |
| max_id | Maximum Sequence Identity (%) | number (min: 50, max: 100, step: 5) | Yes | 95 | Always visible |
| min_id | Minimum Sequence Identity (%) | number (min: 1, max: 50, step: 5) | Yes | 35 | Always visible |
| cutoff | E-value Cutoff | number (min: 0, max: 1, step: 0) | Yes | 0.0001 | Always visible |
| max_iterations | Maximum Iterations | number (min: 1, max: 5, step: 1) | Yes | 1 | Always visible |
| maximum_likelihood | Use Maximum Likelihood | boolean | No | false | Always visible |
| closest | Use Closest Homologs Only | boolean | No | false | Always visible |
| msa_file | Multiple Sequence Alignment (Optional) | file | No | - | Always visible |
| alignment_program | Alignment Program | select (MAFFT, MUSCLE, CLUSTALW) | No | - | Always visible |
| structure_file | PDB Structure File (Optional) | file | No | - | Always visible |
| chain | PDB Chain ID | text | No | A | Always visible |
| query_name | Query Sequence Name | text | No | - | Always visible |
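The numeric ranges and select options listed in the table can be checked before a job is submitted. The following is a minimal sketch and not part of the plugin: `NUMERIC_RANGES`, `SELECT_OPTIONS`, `REQUIRED_FILES`, and `validate_params` are hypothetical names, the bounds are copied from the table, and the sample values are taken from the Example Data section.

```python
# Hypothetical pre-submission check; bounds copied from the Inputs table.
NUMERIC_RANGES = {
    "max_homologs": (10, 500),
    "max_id": (50, 100),
    "min_id": (1, 50),
    "cutoff": (0, 1),
    "max_iterations": (1, 5),
}
SELECT_OPTIONS = {
    "algorithm": {"HMMER", "BLAST"},
    "substitution_model": {"BEST", "JTT", "LG", "WAG", "cpREV", "mtREV", "Dayhoff"},
    "alignment_program": {"MAFFT", "MUSCLE", "CLUSTALW"},
}
REQUIRED_FILES = ("query_sequence", "fasta_database")


def validate_params(params):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in REQUIRED_FILES:
        if not params.get(name):
            problems.append(f"{name} is required")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in params and not low <= params[name] <= high:
            problems.append(f"{name}={params[name]} is outside [{low}, {high}]")
    for name, allowed in SELECT_OPTIONS.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}={params[name]!r} is not one of {sorted(allowed)}")
    return problems


# Values from the Example Data section of this README.
example = {
    "query_sequence": "example/Q5S007.fasta.txt",
    "fasta_database": "example/LRRK2_Mammalia.txt",
    "algorithm": "HMMER",
    "max_homologs": 50,
    "max_id": 95,
    "min_id": 35,
    "cutoff": 0.0001,
}
print(validate_params(example))  # prints []
```

Parameters that are absent are not range-checked here, on the assumption that the plugin falls back to the defaults in the table.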
Input Details
Query Protein Sequence (query_sequence)
Protein sequence in FASTA format for conservation analysis
Protein FASTA Database (fasta_database)
Database of protein sequences for homolog search (e.g., UniRef90, UniProt)
Homolog Search Algorithm (algorithm)
Algorithm for searching homologous sequences
- Options: HMMER, BLAST
Maximum Homologs (max_homologs)
Maximum number of homologous sequences to include in MSA
Substitution Model (substitution_model)
Amino acid substitution model for conservation calculation
- Options: BEST, JTT, LG, WAG, cpREV, mtREV, Dayhoff
Maximum Sequence Identity (%) (max_id)
Maximum sequence identity threshold for homolog filtering
Minimum Sequence Identity (%) (min_id)
Minimum sequence identity threshold for homolog filtering
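One simple way to read `max_id` and `min_id` together is as a window on each homolog's percent identity to the query. The sketch below illustrates that reading only. `percent_identity` and `filter_by_identity` are hypothetical helpers, the sequences are made up, and ConSurf's own filtering (how it measures identity, or how it removes redundant sequences) may differ.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def filter_by_identity(query, homologs, min_id=35, max_id=95):
    """Keep homologs whose identity to the query lies within [min_id, max_id]."""
    kept = {}
    for name, seq in homologs.items():
        identity = percent_identity(query, seq)
        if min_id <= identity <= max_id:
            kept[name] = identity
    return kept


# Made-up aligned sequences, for illustration only.
query = "MKTAYIAKQR"
homologs = {
    "near_duplicate": "MKTAYIAKQR",  # 100% identical, above max_id
    "homolog": "MKTAYLAKHR",         # 8 of 10 positions match
    "distant": "MAVGW-LPNE",         # 1 of 9 compared positions matches
}
print(filter_by_identity(query, homologs))  # prints {'homolog': 80.0}
```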
E-value Cutoff (cutoff)
E-value threshold for homolog search significance
Maximum Iterations (max_iterations)
Number of PSI-BLAST iterations (if BLAST selected)
Use Maximum Likelihood (maximum_likelihood)
Use maximum likelihood method for rate calculation (slower but more accurate)
Use Closest Homologs Only (closest)
Select only the closest homologs based on similarity
Multiple Sequence Alignment (Optional) (msa_file)
Pre-computed MSA file. If provided, skips homolog search and alignment steps
Alignment Program (alignment_program)
Program for multiple sequence alignment (used if MSA not provided)
- Options: MAFFT, MUSCLE, CLUSTALW
PDB Structure File (Optional) (structure_file)
Protein structure file for mapping conservation to 3D structure
PDB Chain ID (chain)
Chain identifier in PDB file (e.g., A, B, C)
- Placeholder: A
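If it is unclear which chain identifiers a structure file contains, they can be listed before choosing a value for `chain`. This is a minimal sketch that relies on the fixed-column PDB format, where column 22 of an `ATOM` record holds the chain ID. `chain_ids` is a hypothetical helper and the coordinates in `sample` are made up.

```python
def chain_ids(pdb_text):
    """Return chain identifiers in order of first appearance in ATOM records."""
    seen = []
    for line in pdb_text.splitlines():
        # In the fixed-column PDB format, column 22 (index 21) holds the chain ID.
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


# Made-up ATOM records, laid out in PDB fixed columns.
sample = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   9.447  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.560  13.300   9.300  1.00 20.00           C",
    "ATOM      3  N   GLY B   1      21.000  23.000  19.000  1.00 20.00           N",
])
print(chain_ids(sample))  # prints ['A', 'B']
```

For a real file, read its contents first, for example with `pathlib.Path(path).read_text()`, and pass the text to `chain_ids`.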
Query Sequence Name (query_name)
Custom name for the query sequence in outputs
- Placeholder: MyProtein
Outputs
| Name | File | Type | Format | Description |
|---|---|---|---|---|
| consurf_outputs | Consurf_Outputs.zip | data | zip | Complete ConSurf analysis results archive |
| conservation_grades | *_consurf_grades.txt | data | txt | Per-residue conservation grades and scores |
| msa_variety | msa_aa_variety_percentage.csv | data | csv | Amino acid variation percentage at each position in MSA |
| query_copy | query.fasta | data | fasta | Copy of input query sequence |
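To make the idea of a per-position amino acid variation percentage concrete, the sketch below computes it for a tiny made-up alignment. It is an illustration under assumptions: `aa_variety_percentages` is a hypothetical helper, gaps are excluded from the totals, and the actual column layout of `msa_aa_variety_percentage.csv` is not documented here and may differ.

```python
from collections import Counter


def aa_variety_percentages(alignment):
    """For each alignment column, return {residue: percent of non-gap sequences}."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        if total:
            columns.append({res: 100.0 * n / total for res, n in counts.items()})
        else:
            columns.append({})  # column contains only gaps
    return columns


# Made-up four-sequence alignment, for illustration only.
msa = ["MKLV", "MKIV", "MRIV", "M-IV"]
for position, percentages in enumerate(aa_variety_percentages(msa), start=1):
    print(position, percentages)
```

In this toy alignment, positions 1 and 4 are fully conserved (a single residue at 100%), while positions 2 and 3 each show two residues.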
Requirements
- Python Version: >=3.8
Example Data
This plugin includes example data for testing:
max_iterations: 1
chain: A
query_name: LRRK2_HUMAN
algorithm: HMMER
max_homologs: 50
substitution_model: BEST
maximum_likelihood: false
query_sequence: example/Q5S007.fasta.txt
fasta_database: example/LRRK2_Mammalia.txt
structure_file: example/AF-Q5S007-F1-model_v6.pdb
max_id: 95
min_id: 35
cutoff: 0.0001
Load example data by clicking the Load Example button in the UI.
Usage
Via UI
- Navigate to analysis → ConSurf Conservation Analysis
- Fill in the required inputs
- Click Run Analysis
Via Plugin System
const jobId = await pluginService.executePlugin('consurf-conservation', {
  // Add parameters here
});