xflicsu.github.io

Software

AdaSampling (https://CRAN.R-project.org/package=AdaSampling)
Implements the adaptive sampling procedure, a framework for both positive unlabeled learning and learning with class label noise.

Reference:
Yang, P.^†, Ormerod, J., Liu, W., Ma, C., Zomaya, A. & Yang, J. (2018). AdaSampling for positive-unlabeled and label noise learning with bioinformatics applications. IEEE Transactions on Cybernetics, doi:10.1109/TCYB.2018.2816984 [PDF], [Repo]

ClueR (https://CRAN.R-project.org/package=ClueR)
CLUster Evaluation (CLUE) is a computational method for identifying the optimal number of clusters in a time-course dataset clustered by the cmeans or kmeans algorithm, and for subsequently identifying key kinases or pathways from each cluster. Its implementation in R is called ClueR.

Reference:
Yang, P.^†, Zheng, X., Jayaswal, V., Hu, G., Yang, J. & Jothi, R. (2015). Knowledge-based analysis for detecting key signaling events from time-series phosphoproteomics data. PLoS Computational Biology, 11(8), e1004403. [Pubmed], [PDF]

directPA (https://cran.r-project.org/package=directPA)
Direction analysis is a set of tools designed to identify combinatorial effects of multiple treatments and/or perturbations on pathways and kinases profiled by microarray, RNA-seq, proteomics, or phosphoproteomics.

Reference:
Yang, P.^✢, Patrick, E.^✢, Tan, S., Fazakerley, D., Burchfield, J., Gribben, C., Prior, M., James, D. & Yang, J. (2014). Direction pathway analysis of large-scale proteomics data reveals novel features of the insulin action pathway. Bioinformatics, 30(6), 808-814. [Pubmed], [PDF]

KinasePA (http://shiny.maths.usyd.edu.au/KinasePA)
Kinase perturbation analysis (KinasePA) is a web tool that allows you to identify key kinases that are perturbed in two treatments compared to control conditions (such as basal or unstimulated conditions).

Description:
The input data should be a comma-separated (CSV) file. The rows are phosphorylation sites, and the columns are treatment 1 vs. control and treatment 2 vs. control. The values should be log2 fold changes. Here is an example file.
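As a rough illustration of this layout, the following Python sketch converts raw intensities into log2 fold changes and writes them as CSV text. The site identifiers, intensity values, and column header names are hypothetical and chosen only for illustration; KinasePA's exact header and site-naming requirements are not stated here, so check the example file before uploading.

```python
import csv
import io
import math

# Hypothetical raw intensities per phosphorylation site (illustrative values only).
intensities = {
    "SITE_A": {"control": 100.0, "treatment1": 400.0, "treatment2": 50.0},
    "SITE_B": {"control": 80.0, "treatment1": 80.0, "treatment2": 320.0},
    "SITE_C": {"control": 64.0, "treatment1": 16.0, "treatment2": 128.0},
}


def log2_fold_changes(site_intensities):
    """Return (site, log2(t1/control), log2(t2/control)) for each site."""
    rows = []
    for site, values in site_intensities.items():
        t1 = math.log2(values["treatment1"] / values["control"])
        t2 = math.log2(values["treatment2"] / values["control"])
        rows.append((site, t1, t2))
    return rows


def to_csv(rows):
    """Serialize rows as comma-separated text: one site per row, two value columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header names are an assumption, not a documented KinasePA requirement.
    writer.writerow(["site", "treatment1_vs_control", "treatment2_vs_control"])
    for site, t1, t2 in rows:
        writer.writerow([site, f"{t1:.3f}", f"{t2:.3f}"])
    return buffer.getvalue()


csv_text = to_csv(log2_fold_changes(intensities))
print(csv_text)
```

Each row holds one phosphorylation site, and each value column holds one treatment-versus-control comparison on the log2 scale, so a value of 0 means no change relative to control.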

KinasePA has also been incorporated into the "directPA" R package. Install the package to find out more:
install.packages("directPA")

Reference:
Yang, P., Patrick, E., Humphrey, S., Ghazanfar, S., James, D., Jothi, R. & Yang, J. (2016). KinasePA: Phosphoproteomics data annotation using hypothesis driven kinase perturbation analysis. Proteomics, 16(13), 1868-1871.

PUEL (https://github.com/PengyiYang/KSP-PUEL)
PUEL is an implementation of a positive-unlabeled ensemble learning model for kinase-substrate prediction using kinase recognition motifs and dynamic phosphoproteomics data.

Prediction results for Akt, mTOR, AMPK, and ERK from different organisms using large-scale phosphoproteomics data are available from here.

Reference:
Yang, P., Humphrey, S., James, D., Yang, J. & Jothi, R. (2016). Positive-unlabeled ensemble learning for kinase substrate prediction from dynamic phosphoproteomics data. Bioinformatics, 32(2), 252-259.

SSO (http://www.maths.usyd.edu.au/u/pengyi/software/Sampling.html)
Sample subset optimization (SSO) is a sampling technique that uses an evolutionary algorithm to optimize sample subsets for learning from imbalanced datasets. See the reference below for details.

Reference:
Yang, P.^†, Yoo, P., Fernando, J., Zhou, B., Zhang, Z. & Zomaya, A. (2014). Sample subset optimization techniques for imbalanced and ensemble learning problems in bioinformatics applications. IEEE Transactions on Cybernetics, 44(3), 445-455. [IEEE Xplore] [PDF]

Legacy version of sample subset optimization package
https://code.google.com/p/sample-subset-optimization
A machine learning algorithm in Java for protein inference
http://code.google.com/p/re-fraction
A boosted learning algorithm in Java for peptide filtering
http://code.google.com/p/self-boosted-percolator
A parallel genetic algorithm in Java for SNP interaction detection
http://code.google.com/p/genetic-ensemble-snpx
An ensemble algorithm in Perl for SNP interaction filtering
http://code.google.com/p/ensemble-of-filters
An open source mass spectrometry analysis pipeline in R
http://code.google.com/p/ocap
A dynamic wavelet package in C/C++ for mass spectrum modeling
http://code.google.com/p/dywave/DyWave
A particle swarm optimisation algorithm in Java for imbalanced data sampling
http://code.google.com/p/imbalanced-data-sampling
