[1]:
!git clone --recursive https://github.com/QData/FastSK.git
%cd /content/FastSK
!pwd
!pip install -r requirements.txt
!pip install .
fatal: destination path 'FastSK' already exists and is not an empty directory.
/content/FastSK
/content/FastSK
Requirement already satisfied: certifi==2020.4.5.1 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 1)) (2020.4.5.1)
Requirement already satisfied: joblib==0.14.1 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 2)) (0.14.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 3)) (1.19.5)
Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 4)) (1.1.5)
Requirement already satisfied: python-dateutil==2.8.1 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 5)) (2.8.1)
Requirement already satisfied: pytz==2019.3 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 6)) (2019.3)
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 7)) (0.22.2.post1)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 8)) (1.4.1)
Requirement already satisfied: six==1.14.0 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 9)) (1.14.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 10)) (4.41.1)
Processing /content/FastSK
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: certifi==2020.4.5.1 in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (2020.4.5.1)
Requirement already satisfied: six==1.14.0 in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (1.14.0)
Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (1.1.5)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (4.41.1)
Requirement already satisfied: python-dateutil==2.8.1 in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (2.8.1)
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (0.22.2.post1)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (1.19.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (1.4.1)
Requirement already satisfied: joblib==0.14.1 in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (0.14.1)
Requirement already satisfied: pytz==2019.3 in /usr/local/lib/python3.7/dist-packages (from fastsk==0.0.0) (2019.3)
Building wheels for collected packages: fastsk
  Building wheel for fastsk (PEP 517) ... done
  Created wheel for fastsk: filename=fastsk-0.0.0-cp37-cp37m-linux_x86_64.whl size=133348 sha256=c316913c55b35022745089b29f50cb79559745006e92a0bea038af7b0b7678fc
  Stored in directory: /tmp/pip-ephem-wheel-cache-2svexk9m/wheels/1b/bf/35/b0f99e1fd166eea045cc19321a8ee175d5f0b4a73f4acc4a76
Successfully built fastsk
Installing collected packages: fastsk
  Found existing installation: fastsk 0.0.0
    Uninstalling fastsk-0.0.0:
      Successfully uninstalled fastsk-0.0.0
Successfully installed fastsk-0.0.0

FastSK Demo

Here is a quick tutorial on how to use the methods in the FastSK package.

[ ]:
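from fastsk import FastSK

# g-mers of length 3, each with 2 mismatch positions
kernel = FastSK(g=3, m=2)

# Toy sequences, already encoded as integers
Xtrain = [[1,0,1,0,1], [1,1,1,0,1]]
Xtest = [[1,1,1,1,1], [1,0,1,0,1]]

# Compute the kernel, then retrieve its training and test portions
kernel.compute_kernel(Xtrain, Xtest)

train_kernel = kernel.get_train_kernel()
test_kernel = kernel.get_test_kernel()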


[ ]:
import seaborn as sns
import matplotlib.pyplot as plt

heat_map = sns.heatmap(train_kernel)
plt.show()

Using the main FastSK Class

fastsk.FastSK(int g, int m, int t=-1, bool approx=False, double delta=0.025, int max_iters=-1, bool skip_variance=False)

Constructor of the FastSK class. This creates a FastSK object with the specified parameters.

g: Required. The overall sequence feature length. FastSK will extract length-g contiguous features (or g-mers) from each training and test sequence.

m: Required. The number of mismatch positions to insert into each of the g-mers.

t: Optional. The number of threads to use to compute the kernel matrix.

approx: Optional. Whether to use the FastSK approximation algorithm.

delta: Optional. The delta parameter to use for the approximation algorithm. Controls how quickly the algorithm converges.

max_iters: Optional. The maximum number of iterations of the approximation algorithm to use.

skip_variance: Optional. If max_iters is set, the skip_variance flag tells FastSK to iterate up to max_iters without performing variance computations.
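
To make the roles of g and m concrete, here is a minimal pure-Python sketch of a gapped g-mer similarity. It is a naive illustration, not FastSK's implementation: the helper names gapped_features and naive_gapped_kernel are hypothetical, the result is unnormalized, and FastSK's actual values may differ (for example through normalization or approximation).

```python
from collections import Counter
from itertools import combinations


def gapped_features(seq, g, m):
    """Count the gapped features of one sequence.

    Each length-g window is reduced to g - m symbols by dropping every
    possible choice of m positions. The dropped positions are kept in the
    key so that only windows masked the same way can match.
    """
    counts = Counter()
    for start in range(len(seq) - g + 1):
        gmer = seq[start:start + g]
        for masked in combinations(range(g), m):
            kept = tuple(c for i, c in enumerate(gmer) if i not in masked)
            counts[(masked, kept)] += 1
    return counts


def naive_gapped_kernel(x, y, g, m):
    """Unnormalized similarity: dot product of the two feature-count vectors."""
    fx = gapped_features(x, g, m)
    fy = gapped_features(y, g, m)
    return sum(n * fy[key] for key, n in fx.items())


# Hypothetical toy sequences, already encoded as integers
a = [1, 0, 1, 0, 1]
b = [1, 1, 1, 0, 1]
print(naive_gapped_kernel(a, b, g=3, m=2))
```

With m=0 the sketch reduces to counting exactly matching g-mers; larger m lets windows match even when they differ at the masked positions.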

FastSK.get_train_kernel()

FastSK.get_test_kernel()

get_train_kernel() returns the training portion of the kernel matrix. get_test_kernel() returns the testing portion of the kernel matrix. Call compute_kernel() first, as in the demo above.

For example, given the setup below: