Submit chemicals

In the first step users can upload up to 100 chemicals using SMILES string format or using CASRN ID or DSSTox ID (see EPA comptox). Inputs are standardized, filtered, and mixtures, ions and inorganic chemicals are excluded using the MolVS Python3.6 library.

Users can refine the chemical list uploaded at this step.

Compute descriptors

For each chemical a set of 677 structural descriptors is computed including 1D and 2D descriptors as well as physico-chemical descriptors using RDKit tool kit implemented in Python 3.6 and OPERA models. The computational time is around 2-5s per chemical.

Users can download the descriptors matrix in csv format from RDKit tool kit and OPERA model at this step.

Example of descriptor computation results

Interference model prediction

Once descriptors have been computed, users can predict chemical interference activity by choosing one or several of the 17 interreference models proposed. Each model is an average of 10 different random forest models built to cover the full set of chemicals (cf. publication). The table below describes the available models.

All random forest models used can be downloaded in a tar.gz archive with the assciated Rscript to run them individually.

Model

Type of study

Wave length (color)

Cell culture

Condition

Luciferase

A-All

Autofluorescence

Blue, Green and Red

HepG2 and HEK293

Cell-based and cell-free

A-Blue

Autofluorescence

Blue

HepG2 and HEK293

Cell-based and cell-free

A-Green

Autofluorescence

Green

HepG2 and HEK293

Cell-based and cell-free

A-Red

Autofluorescence

Red

HepG2 and HEK293

Cell-based and cell-free

A-Blue HepG2 Cell-based

Autofluorescence

Blue

HepG2

Cell-based

A-Blue HEK293 Cell-based

Autofluorescence

Blue

HEK293

Cell-based

A-Blue HepG2 Cell-Free

Autofluorescence

Blue

HepG2

Cell-free

A-Blue HEK293 Cell-Free

Autofluorescence

Blue

HEK293

Cell-free

A-Green HepG2 Cell-based

Autofluorescence

Green

HepG2

Cell-based

A-Green HEK293 Cell-based

Autofluorescence

Green

HEK293

Cell-based

A-Green HepG2 Cell-Free

Autofluorescence

Green

HepG2

Cell-free

A-Green HEK293 Cell-Free

Autofluorescence

Green

HEK293

Cell-free

A-Red HepG2 Cell-based

Autofluorescence

Red

HepG2

Cell-based

A-Red HEK293 Cell-based

Autofluorescence

Red

HEK293

Cell-based

A-Red HepG2 Cell-Free

Autofluorescence

Red

HepG2

Cell-free

A-Red HEK293 Cell-Free

Autofluorescence

Red

HEK293

Cell-free

It takes around 1-2s per chemical and per model to compute the predictions.

For quality check and speed purpose, each chemical no included in the DSSTox database will be add in our internal database by default. Users can choose to not save chemicals in the databse at this step.

Results

Results are presented in a dynamic table. For each model selected a score between 0 and 1 is reported with an associated standard deviation. The score is a probability for a chemical to interfere with the technology, cell culture and condition related to the model. A score close to 1 signifies that the chemical has a high chance to interfere with that particular technology and experimental condition. The standard deviation is derived from the deviation of the ten random forest model predictions.

Users can download the table of results in a csv format.

Please cite: A. Borrel et al.; High-Throughput Screening to Predict Chemical-Assay Interference