/usr/share/RDKit/Contrib/pzc/p_con.html is in rdkit-data 201603.5-2.
This file is owned by root:root, with mode 0o644.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head><title>Python: class p_con</title>
</head><body bgcolor="#f0f0f8">
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ffc8d8">
<td colspan=3 valign=bottom> <br>
<font color="#000000" face="helvetica, arial"><strong>p_con.p_con</strong> = <a name="p_con.p_con">class p_con</a></font></td></tr>
<tr bgcolor="#ffc8d8"><td rowspan=2><tt> </tt></td>
<td colspan=2><tt>Class to create models that classify molecules as active or inactive,<br>
using a threshold on a value in the training data<br> </tt></td></tr>
<tr><td> </td>
<td width="100%">Methods defined here:<br>
<dl><dt><a name="p_con-__init__"><strong>__init__</strong></a>(self, acc_id<font color="#909090">=None</font>, proxy<font color="#909090">={}</font>)</dt><dd><tt>Constructor to initialize the object; use proxy if necessary</tt></dd></dl>
<dl><dt><a name="p_con-__str__"><strong>__str__</strong></a>(self)</dt><dd><tt>String-Representation for Object</tt></dd></dl>
<dl><dt><a name="p_con-load_models"><strong>load_models</strong></a>(self, model_files)</dt><dd><tt>load model or list of models into self.<strong>model</strong></tt></dd></dl>
<dl><dt><a name="p_con-load_mols"><strong>load_mols</strong></a>(self, sd_file)</dt><dd><tt>load SD-File from .sdf, .sdf.gz or .sd.gz</tt></dd></dl>
<dl><dt><a name="p_con-predict"><strong>predict</strong></a>(self, model_number)</dt><dd><tt>try to predict activity of compounds using the given model number</tt></dd></dl>
<dl><dt><a name="p_con-save_model"><strong>save_model</strong></a>(self, outfile, model_number<font color="#909090">=0</font>)</dt><dd><tt>save Model to file using cPickle.dump</tt></dd></dl>
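<p>The <tt>cPickle</tt> module exists only in Python 2; in Python 3 the standard <tt>pickle</tt> module offers the same <tt>dump</tt>/<tt>load</tt> interface. The following is a minimal sketch of that round trip, not the Contrib implementation: the functions <tt>save_model</tt> and <tt>load_model</tt> are hypothetical stand-ins, and a plain dict takes the place of a trained classifier.</p>
<pre>
```python
import os
import pickle
import tempfile


def save_model(model, outfile):
    """Serialize model to outfile (hypothetical stand-in for p_con.save_model)."""
    with open(outfile, "wb") as handle:
        pickle.dump(model, handle)


def load_model(model_file):
    """Load a model previously written by save_model."""
    with open(model_file, "rb") as handle:
        return pickle.load(handle)


# A plain dict stands in for a trained classifier here (assumption).
placeholder_model = {"n_estimators": 100, "threshold": 1000.0}
path = os.path.join(tempfile.mkdtemp(), "model_0.pkl")
save_model(placeholder_model, path)
restored = load_model(path)
```
</pre>
<p>Pickle files can execute arbitrary code when loaded, so only model files from a trusted source should be unpickled.</p>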
<dl><dt><a name="p_con-save_model_info"><strong>save_model_info</strong></a>(self, outfile, mode<font color="#909090">='html'</font>)</dt><dd><tt>create html- or csv-File for models according to mode (default: "html")</tt></dd></dl>
<dl><dt><a name="p_con-save_mols"><strong>save_mols</strong></a>(self, outfile, gzip<font color="#909090">=True</font>)</dt><dd><tt>create SD-File of current molecules in self.<strong>sd_entries</strong></tt></dd></dl>
<dl><dt><a name="p_con-step_0_get_chembl_data"><strong>step_0_get_chembl_data</strong></a>(self)</dt><dd><tt>Download compound data for self.<strong>acc_id</strong>; the compounds are available in self.<strong>sd_entries</strong> afterwards</tt></dd></dl>
<dl><dt><a name="p_con-step_1_keeplargestfrag"><strong>step_1_keeplargestfrag</strong></a>(self)</dt><dd><tt>remove all smaller Fragments per compound, just keep the largest</tt></dd></dl>
<dl><dt><a name="p_con-step_2_remove_dupl"><strong>step_2_remove_dupl</strong></a>(self)</dt><dd><tt>remove duplicates from self.<strong>sd_entries</strong></tt></dd></dl>
<dl><dt><a name="p_con-step_3_merge_IC50"><strong>step_3_merge_IC50</strong></a>(self)</dt><dd><tt>merge IC50 of duplicates into one compound using mean of all values if:<br>
min(IC50) &gt;= IC50_avg-3*IC50_stddev &amp;&amp; max(IC50) &lt;= IC50_avg+3*IC50_stddev &amp;&amp; IC50_stddev &lt;= IC50_avg</tt></dd></dl>
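<p>The merge criterion above can be sketched in plain Python. This is an illustration under stated assumptions rather than the Contrib code: the function name <tt>merge_ic50</tt> is hypothetical, returning <tt>None</tt> for rejected groups is a choice made here, and the sample standard deviation is assumed because the docstring does not say which estimator is used.</p>
<pre>
```python
import statistics


def merge_ic50(values):
    """Return the mean of values if they pass the consistency check, else None.

    Hypothetical helper; assumes the sample standard deviation.
    """
    if not values:
        return None
    if len(values) == 1:
        # A single measurement needs no merging.
        return values[0]
    avg = statistics.mean(values)
    stddev = statistics.stdev(values)
    # All values must lie within three standard deviations of the mean ...
    within_lower = min(values) >= avg - 3 * stddev
    within_upper = avg + 3 * stddev >= max(values)
    # ... and the spread must not exceed the mean itself.
    low_spread = avg >= stddev
    if within_lower and within_upper and low_spread:
        return avg
    return None
```
</pre>
<p>With this sketch, closely agreeing measurements such as 10, 12 and 14 merge to their mean, while a pair such as 1 and 1000 is rejected because its standard deviation exceeds its mean.</p>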
<dl><dt><a name="p_con-step_4_set_TL"><strong>step_4_set_TL</strong></a>(self, threshold, ic50_tag<font color="#909090">='value'</font>)</dt><dd><tt>set Property "TL"(TrafficLight) for each compound:<br>
if ic50_tag (default: "value") &gt; threshold: TL = 0, else TL = 1</tt></dd></dl>
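<p>The labelling rule can be illustrated as follows. This is a minimal sketch, assuming each compound is represented as a dict of string properties; the helper <tt>traffic_light</tt>, the example records and the threshold of 1000 are hypothetical and not taken from the Contrib code.</p>
<pre>
```python
def traffic_light(ic50, threshold):
    """Return 0 if ic50 is above threshold, else 1 (hypothetical helper)."""
    return 0 if ic50 > threshold else 1


# Hypothetical records: property dicts keyed like SD-file tags.
compounds = [{"value": "50.0"}, {"value": "5000.0"}, {"value": "1000.0"}]
for props in compounds:
    # SD-file properties are stored as strings, so convert before comparing.
    props["TL"] = traffic_light(float(props["value"]), threshold=1000.0)
```
</pre>
<p>Because the comparison is strict, a value exactly equal to the threshold receives TL = 1. Since a lower IC50 means a more potent compound, TL = 1 presumably marks the active class.</p>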
<dl><dt><a name="p_con-step_5_remove_descriptors"><strong>step_5_remove_descriptors</strong></a>(self)</dt><dd><tt>remove list of Properties from each compound (hardcoded)<br>
which would corrupt process of creating Prediction-Models</tt></dd></dl>
<dl><dt><a name="p_con-step_6_calc_descriptors"><strong>step_6_calc_descriptors</strong></a>(self)</dt><dd><tt>calculate descriptors for each compound, according to Descriptors._descList</tt></dd></dl>
<dl><dt><a name="p_con-step_7_train_models"><strong>step_7_train_models</strong></a>(self)</dt><dd><tt>train models according to the traffic light (TL) using sklearn.ensemble.RandomForestClassifier<br>
self.<strong>model</strong> contains up to 10 models afterwards; use <a href="#p_con.p_con-save_model_info">save_model_info</a>(outfile, mode) to create a csv or html file<br>
containing data for each model</tt></dd></dl>
</td></tr></table>
</body></html>