Commit

Merge pull request #18 from pdec/master
Ready for v3.4
linsalrob authored Oct 10, 2019
2 parents 93aca25 + 5bf33cf commit 7de1d69
Showing 48 changed files with 630,144 additions and 357,888 deletions.
175,199 changes: 149,145 additions & 26,054 deletions data/genericAll.txt
2,554 changes: 2,554 additions & 0 deletions data/s122586.txt
2,477 changes: 2,477 additions & 0 deletions data/s122587.txt
5,705 changes: 5,705 additions & 0 deletions data/s155864.txt
2,770 changes: 2,770 additions & 0 deletions data/s158878.txt
5,728 changes: 5,728 additions & 0 deletions data/s160488.txt
1,854 changes: 1,854 additions & 0 deletions data/s160490.txt
3,070 changes: 3,070 additions & 0 deletions data/s160492.txt
2,925 changes: 2,925 additions & 0 deletions data/s169963.txt
2,796 changes: 2,796 additions & 0 deletions data/s183190.txt
1,953 changes: 1,953 additions & 0 deletions data/s186103.txt
4,786 changes: 4,786 additions & 0 deletions data/s187410.txt
4,903 changes: 4,903 additions & 0 deletions data/s190486.txt
3,788 changes: 3,788 additions & 0 deletions data/s190650.txt
2,780 changes: 2,780 additions & 0 deletions data/s195102.txt
2,624 changes: 2,624 additions & 0 deletions data/s196620.txt
4,983 changes: 4,983 additions & 0 deletions data/s198214.txt
1,948 changes: 1,948 additions & 0 deletions data/s198466.txt
5,188 changes: 5,188 additions & 0 deletions data/s199310.txt
1,920 changes: 1,920 additions & 0 deletions data/s206672.txt
5,855 changes: 5,855 additions & 0 deletions data/s208964.txt
4,840 changes: 4,840 additions & 0 deletions data/s211586.txt
2,762 changes: 2,762 additions & 0 deletions data/s212717.txt
4,605 changes: 4,605 additions & 0 deletions data/s214092.txt
5,469 changes: 5,469 additions & 0 deletions data/s220341.txt
4,494 changes: 4,494 additions & 0 deletions data/s224308.txt
3,363 changes: 3,363 additions & 0 deletions data/s224914.txt
3,326 changes: 3,326 additions & 0 deletions data/s243230.txt
3,776 changes: 3,776 additions & 0 deletions data/s243277.txt
7,876 changes: 7,876 additions & 0 deletions data/s266835.txt
5,400 changes: 5,400 additions & 0 deletions data/s267608.txt
4,297 changes: 4,297 additions & 0 deletions data/s272558.txt
2,456 changes: 2,456 additions & 0 deletions data/s272623.txt
3,072 changes: 3,072 additions & 0 deletions data/s272626.txt
2,070 changes: 2,070 additions & 0 deletions data/s272843.txt
1,811 changes: 1,811 additions & 0 deletions data/s71421.txt
4,317 changes: 4,317 additions & 0 deletions data/s83331.txt
4,298 changes: 4,298 additions & 0 deletions data/s83332.txt
4,552 changes: 4,552 additions & 0 deletions data/s83333.txt
5,755 changes: 5,755 additions & 0 deletions data/s83334.txt

Large diffs are not rendered by default.

24 changes: 14 additions & 10 deletions modules/classification.py
@@ -28,15 +28,19 @@ def call_randomforest(**kwargs):
     bin_path = os.path.join(os.path.dirname(os.path.dirname(os.path.relpath(__file__))),'bin')
     infile = os.path.join(output_dir, "testSet.txt")
     outfile = os.path.join(output_dir, "classify.tsv")
-    #trainingFile = find_training_genome(trainingFlag,INSTALLATION_DIR)
-    #train_data = np.genfromtxt(fname=trainingFile, delimiter="\t", skip_header=1, filling_values=1)
-    #test_data = np.genfromtxt(fname=infile, delimiter="\t", skip_header=1, filling_values=1)
-    #clf = RandomForestClassifier()
-    #clf.fit(train_data[:, :-1], train_data[:, -1].astype('int'))
-    #print(clf.predict(test_data))
-    #exit()
-    cmd = "Rscript " + bin_path + "/randomForest.r " + trainingFile + " " + infile + " " + outfile
-    os.system(cmd)
+    train_data = np.genfromtxt(fname=trainingFile, delimiter="\t", skip_header=1, filling_values=1) # why not fill missing values with 0?
+    test_data = np.genfromtxt(fname=infile, delimiter="\t", skip_header=1, filling_values=1)
+    # Przemek's comment
+    # by default 10 until version 0.22 where default is 100
+    # number of estimators also implies the precision of probabilities, generally 1/n_estimators
+    # in R's randomForest it's 500 and the usage note regarding number of trees to grow says:
+    # "This should not be set to too small a number, to ensure that every input row gets predicted at least a few times."
+    clf = RandomForestClassifier(n_estimators = 500)
+    clf.fit(train_data[:, :-1], train_data[:, -1].astype('int'))
+    np.savetxt(outfile, clf.predict_proba(test_data)[:,1])
+
+    # cmd = "Rscript " + bin_path + "/randomForest.r " + trainingFile + " " + infile + " " + outfile
+    # os.system(cmd)
 
 def my_sort(orf_list):
     n = len(orf_list)
@@ -235,7 +239,7 @@ def make_initial_tbl(**kwargs): #organismPath, output_dir, window, INSTALLATION_
         infile = open(os.path.join(self.output_dir, 'classify.tsv'), 'r')
         outfile = open(os.path.join(self.output_dir, 'initial_tbl.tsv'), 'w')
     except:
-        sys.exit('ERROR: Cannot open classify.tsv')
+        sys.exit('ERROR: Cannot open classify.tsv in make_initial_tbl')
     #x = input_bactpp(**kwargs)
     j = 1
     ranks = [[] for n in range(len(x))]
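
The comment added in call_randomforest says the number of estimators implies the precision of the predicted probabilities, generally 1/n_estimators. The arithmetic behind that remark can be sketched as follows, assuming every tree casts a hard 0/1 vote and the forest reports the mean vote. This is a toy model rather than scikit-learn's implementation, and the helper names forest_probability, possible_probabilities and resolution are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def forest_probability(votes):
    """Mean of hard 0/1 votes, one vote per tree (toy model of a forest)."""
    if not votes:
        raise ValueError("at least one vote is required")
    return Fraction(sum(votes), len(votes))


def possible_probabilities(n_estimators):
    """Every value the mean of n_estimators hard votes can take."""
    return [Fraction(k, n_estimators) for k in range(n_estimators + 1)]


def resolution(n_estimators):
    """Smallest gap between two distinct achievable probabilities."""
    values = possible_probabilities(n_estimators)
    return min(b - a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    # Compare the old scikit-learn default (10), the newer default (100)
    # and the value chosen in this commit (500, matching R's randomForest).
    for n in (10, 100, 500):
        print(n, len(possible_probabilities(n)), float(resolution(n)))
```

Under this assumption, 10 trees can only produce scores in steps of 0.1, while 500 trees give steps of 0.002. scikit-learn actually averages per-tree class probabilities, so trees whose leaves are not pure can yield finer values than this bound suggests.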