The Prediction page uses plantiSMASH to predict gene clusters from user input, and provides gene cluster product types, wheat gene expression data, annotation information and result downloads.
1. Select species. The user selects the species of interest, with the gene ID format for that species in parentheses. Please take care to enter the correct gene ID.
2. Input genes. Click the Example button to use the sample data. Please note that the input should be a single column of gene IDs, one per line, without punctuation.
3. Select cluster product type. The user can select the cluster product type of interest in the multi-select box.
After entering the genes, click the Predict button in the main panel to run the prediction.
Please note that if the user enters too few genes, selects too few cluster product types, or uses an incorrect gene ID format, no prediction results will be available. Please enter more genes, select more cluster product types, or check the gene ID format and retry.
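The input format described above (one gene ID per line, no punctuation) can be prepared from a messier list with a short script. The following is a minimal sketch and not part of the tool: the function name clean_gene_list is hypothetical, the gene IDs are placeholders, and dropping duplicate IDs is an assumption made here for convenience.

```python
import re

def clean_gene_list(raw_text):
    """Return gene IDs in input order, without separators, quotes or duplicates."""
    # Split on whitespace, commas, semicolons and quote characters.
    # Dots are kept because some ID formats contain them.
    tokens = re.split(r"[\s,;\"']+", raw_text)
    genes = []
    seen = set()
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes

# Placeholder IDs, used only to illustrate the clean-up
raw = '"TraesCS1A02G000100", TraesCS1A02G000200;\nTraesCS1A02G000200\n'
cleaned = clean_gene_list(raw)
print("\n".join(cleaned))  # one ID per line, ready to paste into the input box
```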
After a successful prediction, the results are displayed in the table below. The table includes information on genes, clusters, cluster product types, chromosomes, start and stop sites, gene function, and wheat gene expression data. Expression data are from IWGSC.
The Filter function filters predicted gene clusters at two levels: co-expression (using WGCNA) and co-pathway (using Pathway Tools).
Please note, this function is only available for wheat gene clusters for now.
4.1. Co-expression. Select Yes for co-expression filtering, select No for no filtering. The dataset selector on the right lists the six datasets used to construct the co-expression networks, and users can make multiple selections as needed.
The co-expression filtering condition is that at least two genes in the cluster are located in the same co-expression module. The co-expression network was constructed using 850 sets of wheat transcriptome data from IWGSC with WGCNA.
4.2. Co-pathway. Select Yes for co-pathway filtering, select No for no filtering.
The co-pathway filtering condition is that at least two genes within the cluster are located in the same metabolic pathway.
The co-pathway analysis was performed using Pathway Tools, a plant genomic metabolic pathway prediction tool, for genome-wide metabolic pathway annotation in common wheat.
Please note the format of the input genes and the filtering conditions. Stricter filtering conditions may result in fewer or no results.
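Both filtering conditions come down to the same check: a cluster is kept when at least two of its genes share a group, where a group is a co-expression module or a metabolic pathway. The following is a minimal sketch of that check, not the tool's own code. The function name passes_filter, the gene and module labels, and the treatment of NONE as "not annotated" are assumptions made for illustration.

```python
from collections import Counter

def passes_filter(cluster_genes, gene_to_groups, min_genes=2):
    """True if at least `min_genes` genes of the cluster share one group.

    gene_to_groups maps a gene ID to a set of group labels
    (module names or pathway IDs).
    """
    counts = Counter()
    for gene in set(cluster_genes):
        for group in gene_to_groups.get(gene, ()):
            if group != "NONE":  # unannotated genes never count toward a shared group
                counts[group] += 1
    return any(n >= min_genes for n in counts.values())

# Hypothetical annotation and clusters
modules = {"geneA": {"M1"}, "geneB": {"M1"}, "geneC": {"NONE"}}
clusters = {"c1": ["geneA", "geneB", "geneC"], "c2": ["geneA", "geneC"]}

kept = [cid for cid, genes in clusters.items() if passes_filter(genes, modules)]
print(kept)  # ['c1']
```

Raising min_genes above 2 shows why stricter conditions return fewer clusters.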
The filtering results are presented in the table below. The table includes information on gene ID, gene cluster, cluster product type, chromosome, start/stop site and the dataset used for co-expression network construction.
The Dataset column identifies which of the 6 datasets was used.
Users can click the Download button to download the prediction results. The Gene column and Cluster column of the result file can be used for subsequent analysis.
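As an illustration of such subsequent analysis, the Gene and Cluster columns can be regrouped into one gene list per cluster. This is a minimal sketch that assumes the downloaded file is comma-separated with header names exactly Gene and Cluster; the actual file layout may differ, and the gene and cluster IDs below are placeholders.

```python
import csv
import io
from collections import defaultdict

def genes_by_cluster(csv_text):
    """Group the Gene column by the Cluster column of a result table."""
    clusters = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        clusters[row["Cluster"]].append(row["Gene"])
    return dict(clusters)

# Placeholder content standing in for a downloaded result file
example = "Gene,Cluster\ngeneA,c1\ngeneB,c1\ngeneC,c2\n"
grouped = genes_by_cluster(example)
for cluster_id, genes in grouped.items():
    print(cluster_id, len(genes), ",".join(genes))
```

For a real file, the text can be read with open(path, newline="", encoding="utf-8").read() before calling the function.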
The Pathway page shows pathway maps of gene functions from KEGG and Pathway Tools.
Please note, this function is only available for wheat gene clusters for now.
Click the Show button to display gene cluster information in the table below.
This includes information on gene clusters, chromosomes, cluster product types, start/end loci, pathway map, and map name.
The Module column indicates one of the 61 co-expression modules included in the 6 datasets. NONE indicates that the gene is not annotated to any co-expression module.
The PathwayID column indicates the pathway ID that the gene is annotated to. NONE indicates that the gene is not annotated to any pathway.
Click on the cluster product types to query the compound in KEGG. Click the Download button to download the table.
Click on a row to expand it and show the IWGSC ID, NCBI ID, domain, gene name and pathway map of the gene. IWGSC IDs and NCBI IDs are mapped to each other using BLAST.
Click on the IWGSC ID to query the gene in Ensembl Plants. Click on the NCBI ID to query the gene in KEGG.
After selecting a single row in the table, click the Get button in the map panel to get the pathway map of the selected cluster.
The green box indicates the enzyme corresponding to the gene in that cluster in the pathway. Hover over or click on the green box to get information on the specific gene corresponding to that enzyme.
Please refer to the official KEGG website for details on using pathway maps.
The Collinearity page provides collinearity analysis between wheat gene clusters and homologous species. Collinearity analysis and visualization are performed using MCscan (Python version, JCVI package).
1. Homologous species. Select the homologous species of interest for gene cluster collinearity analysis.
2. Input cluster IDs. Enter gene cluster IDs. Click the Example button to use the example data.
Please note that the entered cluster ID should be a column of cluster IDs without punctuation, and the uploaded file should be a csv file including a column of cluster IDs without punctuation.
3. Select cluster product type. The user can select the cluster product type of interest in the multi-select box.
After selecting the chromosome and species, click the Show button to display the gene cluster collinearity information in the table below.
The table includes information on wheat gene clusters, cluster product types, chromosomes, start and stop loci, homologous species, chromosomes of homologous species, and start and stop loci of homologous species.
Please note that the table includes all wheat gene clusters that meet the criteria, whether or not they have collinear genes on the chromosomes of the homologous species.
Click on the triangle button on the right to display collinear gene pairs.
Please note that the expanded table shows only the gene pairs that are collinear between the cluster and the chromosome of the homologous species, not all genes within the cluster.
Where the Collinearity column shows '-' (e.g. for cluster c1), the wheat gene cluster has no collinear genes on the chromosomes of that homologous species.
Click the Download cluster list button to download the collinear cluster pair information. Click the Download gene list button to download the collinear gene pair information. The Cluster column of the file can be used for subsequent analysis.
Select a single row and click the Analyse button to run the collinearity visualization. The analysis will take about 30 seconds, during which a progress bar will be displayed. Please be patient.
After successful analysis, the visual collinearity map is presented below.
Chromosome information, start and stop loci and collinear gene pair IDs are displayed as labels in the graph.
The direction of the arrow indicates the positive and negative strand information of the gene.
The color of the arrows indicates homology, i.e. arrows with the same color on both chromosomes indicate homologous genes.
Click the Download button to download the original image.
The Similarity page provides gene cluster similarity analysis using BiG-SCAPE.
1. Select species. Users can multi-select the species of interest.
Enter the chromosome start and end loci according to the selected species. Please ensure that the chromosome range entered is large enough to display the desired gene clusters.
2. Input cluster IDs. Enter a gene cluster ID or upload a gene cluster ID file. Click the Example button to use the example data.
Please note that the entered cluster ID should be a column of cluster IDs without punctuation, and the uploaded file should be a csv file including a column of cluster IDs without punctuation.
3. Select cluster product type. The user can select the cluster product type of interest in the multi-select box.
Please note that if the user enters too few gene cluster IDs, selects too few cluster product types, or uses an incorrect gene cluster ID format, no results will be available.
Please enter more gene cluster IDs, check the format, or select more cluster product types to retry.
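A cluster ID file in the format described above (a csv file with one column of cluster IDs and no punctuation) can be generated with a few lines of code. The following is a minimal sketch, assuming that no header row is expected, which the page does not state. The function name write_cluster_ids is hypothetical and the IDs are placeholders.

```python
import csv
import io

def write_cluster_ids(cluster_ids, handle):
    """Write one cleaned cluster ID per row to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    for cluster_id in cluster_ids:
        # Remove surrounding whitespace, quotes and separators
        cleaned = cluster_id.strip().strip("\"',;")
        if cleaned:
            writer.writerow([cleaned])

# Write to an in-memory buffer to show the resulting file content
buffer = io.StringIO()
write_cluster_ids(["c1", " c2,", "'c3'"], buffer)
print(buffer.getvalue())
```

To save a file for upload, pass a handle opened with open("cluster_ids.csv", "w", newline="", encoding="utf-8") instead of the buffer; the file name is only an example.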
Click the Show button to display gene cluster information in the table below.
This includes information on species, gene clusters, chromosomes, cluster product types, and start and end loci.
It is recommended to select all or multiple gene clusters to obtain reliable similarity analysis results.
Click the Analyse button to perform a similarity analysis. The analysis will take about 30 seconds, during which a progress bar will be displayed. Please be patient.
After successful analysis, click the Download button to download the result file in tar.gz format. The index.html in the file is an interactive display of the analysis results.
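The downloaded archive can be unpacked with any archive tool. As a scripted alternative, the following minimal sketch extracts a tar.gz file and locates index.html using only the Python standard library. The function name unpack_results and the file names in the usage comment are hypothetical, and the internal folder layout of the archive is not assumed.

```python
import tarfile
from pathlib import Path

def unpack_results(archive_path, target_dir):
    """Extract a tar.gz archive and return the paths of any index.html inside."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths
            archive.extractall(target, filter="data")
        else:
            archive.extractall(target)
    # index.html may sit in a subfolder, so search recursively
    return sorted(target.rglob("index.html"))

# Usage (hypothetical file names):
# pages = unpack_results("similarity_result.tar.gz", "similarity_result")
# print(pages[0].resolve())  # open this file in a web browser
```

The whole archive is extracted rather than index.html alone, because an interactive HTML report usually depends on other files shipped alongside it.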