List of Workflows

The INB platform has developed a series of Web applications that are accessed through easy-to-use web pages and are also exposed as web services, which can then be combined into specific workflows. The current list of applications and workflows is:

 

Splash: Screening at the Protein level for alternative Splicing homologues. This server allows the identification and annotation of alternative splicing events that are equivalent among species. During 2007 the webserver was completely rebuilt, changing the interface and adding new databases for searching information (ASP, Ensembl). A new confidence index has been added, as well as new links to related databases and servers.
ProStar: During 2007-08 we finished the development of this webserver, designed for the ab initio prediction of promoters based on their unusual physical properties. The algorithm has recently been benchmarked and performed better than most of the currently available promoter prediction algorithms.
DNAlive: During 2008 we developed a new server for the calculation and visualization of physical properties of DNA. The server contains a wide repertoire of physical descriptors and very powerful visualization tools, including mesoscopic simulators that allow, for the first time, flexibility considerations to be introduced into the modelling of chromatin structure.
Flexserv: During 2008, in collaboration with the BSC node, we completed the development of this webserver, which includes for the first time a complete description of protein flexibility based on: i) discrete molecular dynamics, ii) normal mode analysis, iii) Brownian dynamics and iv) when available, molecular dynamics. The server provides information on domain and hinge locations, on correlated movements among residues, on the placement of rigid cores and on many other dynamic descriptors. The server will be linked during 2008 to other web applications of the group, such as FSOLV or PMUT.
MDweb. Webpage that allows extremely easy access to the different services developed for automatic structure setup and simulation. See description in the Webservices and Workflow section (above).
PMUT and precomputed PMUT. The webpage, one of the most widely used for the analysis of Mendelian pathologies, and the associated databases were updated during 2007-2008 and linked with other databases. The constituent webservices have been included in different workflows.
FSOLV. Webserver for predicting the solvation properties of proteins. The constituent webservices have been included in different workflows.
jORCA, a user-oriented software client that emerged from the ESP-SOL project and was extended and generalized in the INB to facilitate integration and encourage the use of INB webservices. The main features of this standalone client are:
• Connection to different service repositories: any MOBY Central, the INB production and development servers, and the ACGT-project production server.
• Execution of services using advanced/simple mode for asynchronous and mirrored services.
• Datatypes, services and namespaces organized hierarchically in a tree with fast-search that automatically filters the services/datatypes/namespaces that match the text.
• Advanced searching of services and datatypes through the embedded Magallanes plug-in.
• Easy datatype creation through the embedded Caronte plug-in.
• User file system where user objects can be browsed, connected to compatible services and automatically launched in pipeline fashion.
• Favorites-like style to allow users to organize their main tasks.
• Tracing and logging of executed services.
• Tested on Microsoft Windows XP and Unix-like operating systems.
Genomics: The main entry point to our services and workflows is at http://genome.imim.es/webservices/index.html. During this period we have been working on the development of services and workflows for gene finding. Specifically, we have developed workflows for geneid and sgp. We have also developed a workflow for the clustering of genes on genome coordinates. We published our workflow for the analysis of co-regulated genes in the journal Bioinformatics, and we are preparing a publication describing the workflow for gene clustering.

Visual genomics: In addition to the platform, two new Web services have been created and registered in the INB platform. The first Web service retrieves a list of genes expressed in a given anatomical component in any of the 27 Theiler Stages (TS). The second one lists the anatomical components together with the Theiler Stages in which a given gene is expressed. To turn these developments into a useful tool for the non-expert bioinformatician, GN7-CNB has built a visual genomics workflow in Taverna. With this workflow, the user can start from an unknown human nucleotide sequence and retrieve the anatomical components and TS (in mice) in which the orthologous sequence is expressed. The workflow uses the following services in order:

• the INB Blast service, run against EMBL MUS MUSCULUS,
• the INB parse service, which retrieves the EMBL identifiers from the Blast report,
• the INB conversion service, which translates each EMBL id into the MGI id needed to query our system, and finally
• the INB service that lists the components where the query sequence is expressed.
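
The chaining of these four steps can be sketched in a few lines of Python. This is only an illustration of the data flow, not the Taverna workflow itself: the function expression_for_sequence, the four callables it receives and all identifiers in the toy example are hypothetical stand-ins, since the actual service interfaces are not described here.

```python
from typing import Callable, Dict, List, Optional


def expression_for_sequence(
    sequence: str,
    blast: Callable[[str], str],
    parse_embl_ids: Callable[[str], List[str]],
    embl_to_mgi: Callable[[str], Optional[str]],
    list_expression: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Chain the four steps; every callable is a hypothetical service wrapper."""
    report = blast(sequence)                       # step 1: similarity search
    results: Dict[str, List[str]] = {}
    for embl_id in parse_embl_ids(report):         # step 2: ids from the report
        mgi_id = embl_to_mgi(embl_id)              # step 3: EMBL id -> MGI id
        if mgi_id is None:
            continue                               # no mapping: skip this hit
        results[mgi_id] = list_expression(mgi_id)  # step 4: expression sites
    return results


# Toy stand-ins with made-up identifiers, used only to show the data flow.
def fake_blast(seq: str) -> str:
    return "hit EMBL_A\nhit EMBL_B\n"


def fake_parse(report: str) -> List[str]:
    return [line.split()[1] for line in report.splitlines() if line]


def fake_convert(embl_id: str) -> Optional[str]:
    return {"EMBL_A": "MGI_1"}.get(embl_id)        # EMBL_B has no mapping


def fake_expression(mgi_id: str) -> List[str]:
    return ["component-1 (TS-x)"]


demo = expression_for_sequence(
    "ACGT", fake_blast, fake_parse, fake_convert, fake_expression
)
```

Passing the services in as plain callables keeps the sketch independent of any particular client library; each stand-in could be replaced by a real service call without changing the chaining logic.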

Co-expressed genes clustering workflows: Protocol description: clustering of genes based on the pairwise alignment of their TFBS maps. This protocol can be used to validate the clustering of gene expression data.

ESTs assembly: Input: chromatogram data; Output: the read sequences and associated quality data as well as the phrap ass Gene predictions.

GeneID workflows: The following workflow takes a genomic sequence in FASTA format, runs geneid, then translates the gene predictions into a set of peptide sequences. These peptides are then passed to programs that take peptide sequence input (blastp, pfam search, etc.), which gives some hints about the function of the predicted genes.
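
The translation step can be illustrated with a minimal Python sketch. It assumes that a gene prediction is available as a list of exon coordinates (1-based, inclusive) plus a strand, and that the standard genetic code applies; the function names and the toy sequences are illustrative and are not part of geneid or of the workflow services.

```python
from typing import List, Tuple

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    peptide = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")  # X: ambiguous codon
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


def translate_gene(genome: str, exons: List[Tuple[int, int]], strand: str) -> str:
    """Splice exons (1-based, inclusive coordinates) and translate the result."""
    genome = genome.upper()
    spliced = "".join(genome[start - 1:end] for start, end in sorted(exons))
    if strand == "-":
        spliced = reverse_complement(spliced)
    return translate(spliced)


# Toy example: two exons separated by a six-base intron.
toy_genome = "ATG" + "GTAAGT" + "GCCTAA"
toy_peptide = translate_gene(toy_genome, [(1, 3), (10, 15)], "+")
```

The resulting peptide strings are the kind of input that tools such as blastp expect.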

SGP2 workflows: SGP2 is an adaptation of geneid that takes synteny data into account in order to improve the accuracy of the predictions. The synteny data are computed from a syntenic sequence from another organism by performing a tblastx search. The following workflow takes two syntenic sequences in FASTA format, runs tblastx, then runs SGP2 on the sequence for which gene predictions are wanted, and finally translates the gene predictions into a set of peptide sequences.

Workflows running both geneid and SGP2: The following workflow runs both geneid and SGP2 and generates an annotation map using gff2ps to visualize the annotations in order to compare them.
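
The workflow itself compares the two annotation sets visually with gff2ps. As a complementary illustration, and not as part of the workflow, the same comparison can be sketched programmatically in Python. The sketch assumes both programs emit tab-separated GFF lines (seqname, source, feature, start, end, score, strand, frame); the function names and the toy records are made up for the example.

```python
from typing import Dict, List, Set, Tuple

Exon = Tuple[str, int, int, str]  # (seqname, start, end, strand)


def parse_gff_exons(gff_text: str) -> Set[Exon]:
    """Collect exon coordinates from tab-separated GFF lines."""
    exons: Set[Exon] = set()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 8:
            continue                      # not a complete GFF record
        seqname, start, end, strand = fields[0], fields[3], fields[4], fields[6]
        exons.add((seqname, int(start), int(end), strand))
    return exons


def compare_predictions(gff_a: str, gff_b: str) -> Dict[str, List[Exon]]:
    """Report exons predicted by both programs and by only one of them."""
    a, b = parse_gff_exons(gff_a), parse_gff_exons(gff_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


def gff_line(*fields: str) -> str:
    return "\t".join(fields)


# Toy records with invented coordinates, only to exercise the functions.
toy_geneid = "\n".join([
    gff_line("seq1", "geneid", "First", "100", "200", ".", "+", "0"),
    gff_line("seq1", "geneid", "Terminal", "300", "400", ".", "+", "1"),
])
toy_sgp2 = "\n".join([
    "# toy SGP2 output",
    gff_line("seq1", "sgp2", "First", "100", "200", ".", "+", "0"),
    gff_line("seq1", "sgp2", "Terminal", "320", "400", ".", "+", "1"),
])
toy_comparison = compare_predictions(toy_geneid, toy_sgp2)
```

Exact coordinate matching is the simplest possible criterion; a real comparison would probably also want to report partially overlapping exons.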

 

List of workflows that are accessible from IWEE&M:


1) Automatic Annotation of Protein Function.

2) Automatic Annotations and Gene Ontology terms of Protein Function.

3) Characterization of peroxysomal metabolome and its evolutive origin.

4) Clustering of co-expressed genes in subsets showing similar configurations of TFBSs.

5) Clustering of co-expressed genes in subsets showing similar configurations of TFBSs.

6) ESTs assembly workflow.

7) Full analysis of gene expression data.

8) Gene detection by homology (D. Torrents, BSC.).

9) GeneID_Workflow_icapture.

10) Hot Spot Analysis of Protein Sequence Workflow.

11) Information Hyperlinked over Proteins (iHOP).

12) Predicting Functionally Important Residues.

13) Predicting Functionally Important Residues.

14) Preprocessing and clustering of gene expression data.

15) Preprocessing and differential expression test of gene expression data.

16) Preprocessing, differential expression test and FatiGO test.

17) Preprocessing, differential expression test and FatiScan test.

18) Protein Structure Basic Optimization Workflow.

19) Protein Structure Solvation Analysis Workflow.

20) runSGP2GFF.

21) SGP2_Geneid_Comparison.

22) Wf-iHOP-SOAP.

 
RECOMB 2010