The goal of this project is to design and run data science experiments to test various transductive and semi-supervised learning algorithms.
The theory behind transductive SVMs (TSVMs) is described on my blog: http://charlesmartin14.wordpress.com/2014/07/06/machine-learning-with-missing-labels-transductive-svms/
The first objective is to test svmlin http://vikas.sindhwani.org/svmlin.html
against liblinear http://www.csie.ntu.edu.tw/~cjlin/liblinear/
using the binary datasets provided for libsvm http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
to determine how to tune svmlin well and which kinds of data sets it performs well on.
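The libsvm binary data sets are fully labeled, so comparing a transductive learner with a purely supervised one means hiding some training labels first. A minimal sketch of that step follows. It is not taken from svmlin.rb: the helper name `mask_labels`, the labeled fraction, and the seed are hypothetical, and the use of 0 to mark an unlabeled example is an assumption that should be checked against the svmlin documentation.

```ruby
# Hypothetical helper: keep only a fraction of the labels and hide the rest.
# Assumption: a label of 0 marks an unlabeled example.
def mask_labels(labels, labeled_fraction, seed: 42)
  rng = Random.new(seed)                       # fixed seed so runs are repeatable
  keep = (labels.size * labeled_fraction).round
  keep_idx = (0...labels.size).to_a.shuffle(random: rng).first(keep)

  masked = Array.new(labels.size, 0)           # start with everything unlabeled
  keep_idx.each { |i| masked[i] = labels[i] }  # restore the labels we keep
  masked
end

labels = [1, -1, 1, 1, -1, -1, 1, -1, 1, -1]
masked = mask_labels(labels, 0.3)
puts masked.inspect
```

The supervised baseline would then be trained only on the examples whose label survived, while the transductive learner sees all examples with the masked labels.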
Later, we would like to look at semi-supervised learning algorithms such as
http://www.dii.unisi.it/~melacci/lapsvmp/
and the Python scikit-learn label propagation algorithm
http://scikit-learn.org/stable/modules/label_propagation.html
For Newbies: If you don't know anything about machine learning, you should first learn how to run liblinear on the libsvm data sets.
We need someone to create a liblinear tutorial. For now, you can see http://jamescpoole.com/2012/10/30/libsvm-tutorial-part-1-overview/
libsvm's data format and command-line usage are almost identical to liblinear's, so most of that tutorial carries over.
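Before running either tool it helps to know what the data files look like. Each line in the libsvm format is a label followed by sparse `index:value` pairs. The sketch below parses one such line; the function name `parse_libsvm_line` is hypothetical, the sample line is illustrative rather than copied from a data set, and the code assumes integer labels, which fits the binary data sets.

```ruby
# Hypothetical helper: parse one line of libsvm-format data,
# e.g. "-1 3:1 11:1 14:1", into a label and a sparse feature hash.
def parse_libsvm_line(line)
  label, *pairs = line.split
  features = pairs.each_with_object({}) do |pair, h|
    index, value = pair.split(":")
    h[Integer(index)] = Float(value)   # feature index => feature value
  end
  [Integer(label), features]           # Integer() also accepts "+1"
end

label, features = parse_libsvm_line("-1 3:1 11:1 14:1")
puts "label=#{label} nonzero_features=#{features.size}"
```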
To get started

0. requirements: ruby 2.x and gnu parallel
   - ruby can be installed using rvm
   - gnu parallel should be in the path
1. download and install liblinear and svmlin
2. download the a1a training and test data sets
   http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a
   http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a.t
3. edit svmlin.rb and set the variables
   SVMLIN_DIR = "~/packages/svmlin-v1.0"
   LIBLINEAR_DIR = "~/packages/liblinear-1.94"
4. run svmlin.rb a1a
5. repeat for the a2a, a3a, ... data sets and the w2a, w3a, ... data sets
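Repeating the download and run steps by hand gets tedious, so they can be scripted. The sketch below only prints the download URLs and the command for each data set. The helper names are hypothetical, the upper bounds (a9a and w8a) are an assumption that should be checked against the data set page, and invoking the script as `ruby svmlin.rb <name>` is assumed from the run step above.

```ruby
BASE_URL = "http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary"

# Assumption: the families run a1a..a9a and w1a..w8a; adjust to the page.
def dataset_names(a_max: 9, w_max: 8)
  (1..a_max).map { |i| "a#{i}a" } + (1..w_max).map { |i| "w#{i}a" }
end

# Training file and test file (".t" suffix) for one data set.
def dataset_urls(name)
  ["#{BASE_URL}/#{name}", "#{BASE_URL}/#{name}.t"]
end

# Command as an argument array, so it can be passed to system(*cmd).
def run_command(name)
  ["ruby", "svmlin.rb", name]
end

dataset_names.each do |name|
  puts dataset_urls(name).join("  ")
  puts run_command(name).join(" ")
  # system(*run_command(name))  # uncomment to actually run the experiment
end
```

Keeping this as a dry run by default makes it easy to check the list of data sets before launching anything.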