Kemfor

The Platform

Six modules. One decision layer.

Predict toxicity across 13,561 species, understand why with gene-level attribution, and generate novel molecules optimized for selectivity — all from one interface.

Data Foundation

Ag Sensitivity Atlas

A normalized ecotoxicology and genomics dataset assembled for crop protection. Six source databases ingested, deduplicated, and standardized into a single queryable layer.

ECOTOX: EPA/USGS ecotoxicology database, ~1.2M chemical-species measurements
ToxCast: EPA high-throughput assay data, 2.78M in vitro bioactivity endpoints
CompTox: EPA chemical safety dashboard (structure, physicochemical, and hazard data)
PubChem: NIH chemical biology (bioactivity, structure, computed properties)
Ensembl: EMBL-EBI genome annotation, 22,069 pesticide target genes across 122 species with transcriptomes
Normalization: 28,355 compounds in atlas · 16,836 (59%) fully harmonized with CAS, InChIKey, SMILES
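
As a rough illustration of what "deduplicated and fully harmonized" can mean in practice, the sketch below merges duplicate compound records and counts those that carry all three identifiers. The `CompoundRecord` type, the merge key (InChIKey, then CAS, then name), and the sample records are assumptions made for this example, not a description of the Atlas pipeline.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Optional


@dataclass(frozen=True)
class CompoundRecord:
    """Hypothetical record shape; the real Atlas schema is not published here."""
    name: str
    cas: Optional[str] = None
    inchikey: Optional[str] = None
    smiles: Optional[str] = None

    def is_fully_harmonized(self) -> bool:
        # "Fully harmonized" = CAS, InChIKey, and SMILES are all present.
        return bool(self.cas and self.inchikey and self.smiles)


def deduplicate(records: Iterable[CompoundRecord]) -> list[CompoundRecord]:
    """Merge records that share a key, filling in missing identifiers.

    Assumed key order: InChIKey, then CAS, then lowercased name.
    """
    merged: dict[str, CompoundRecord] = {}
    for rec in records:
        key = rec.inchikey or rec.cas or rec.name.strip().lower()
        kept = merged.get(key)
        if kept is None:
            merged[key] = rec
        else:
            merged[key] = replace(
                kept,
                cas=kept.cas or rec.cas,
                inchikey=kept.inchikey or rec.inchikey,
                smiles=kept.smiles or rec.smiles,
            )
    return list(merged.values())


def harmonized_fraction(records: Iterable[CompoundRecord]) -> float:
    records = list(records)
    if not records:
        return 0.0
    return sum(r.is_fully_harmonized() for r in records) / len(records)


# Two partial atrazine records from different sources plus one unresolved entry.
records = [
    CompoundRecord("Atrazine", cas="1912-24-9",
                   inchikey="MXWJVTOOROXGIU-UHFFFAOYSA-N"),
    CompoundRecord("atrazine", inchikey="MXWJVTOOROXGIU-UHFFFAOYSA-N",
                   smiles="CCNc1nc(Cl)nc(NC(C)C)n1"),
    CompoundRecord("Unresolved product X"),
]
atlas = deduplicate(records)
print(len(atlas), harmonized_fraction(atlas))  # 2 0.5

# The figure quoted above: 16,836 of 28,355 compounds.
print(round(16_836 / 28_355, 2))  # 0.59
```

The two atrazine records collapse into one entry that has all three identifiers, which is the sense in which merging sources can raise the harmonized share.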

Predictive Intelligence

Predictive Models

KemBoost predicts toxicity across the 13,561 species indexed in the Atlas, with gene-level explanations for the 378 species that have reference genomes (122 with full Ensembl transcriptomes). Published QSAR models, by contrast, are limited to fewer than 10 species.

Architecture: Multi-scale convolutional attention with factored bilinear interaction
Parameters: 168,000 (compact, fast, interpretable)
Performance: R²=0.723 on held-out test set across the benchmark species
Cross-species: Generalizes to species with no measured toxicity data using genomic embeddings, where genomes are available
Explanations: Gene-level attribution for species with transcriptome coverage
vs. Published: Standard QSAR models cover <10 species, with no genomic context and no cross-species generalization
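
To give a feel for the "factored bilinear interaction" term, here is a minimal sketch of one common way to factor a bilinear score between a compound embedding and a species embedding. It is a generic low-rank formulation, not KemBoost's actual layer: the dimensions, the rank, the random weights, and the names `U`, `V`, and `w` are all assumptions, and the convolutional attention part is left out entirely.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def factored_bilinear(compound, species, U, V, w):
    """score = sum_k w[k] * (U @ compound)[k] * (V @ species)[k]"""
    projected_compound = matvec(U, compound)
    projected_species = matvec(V, species)
    return sum(
        wk * a * b
        for wk, a, b in zip(w, projected_compound, projected_species)
    )


def full_bilinear(compound, species, U, V, w):
    """Same score via the explicit matrix W[i][j] = sum_k w[k]*U[k][i]*V[k][j]."""
    rank = len(w)
    total = 0.0
    for i, ci in enumerate(compound):
        for j, sj in enumerate(species):
            w_ij = sum(w[k] * U[k][i] * V[k][j] for k in range(rank))
            total += ci * w_ij * sj
    return total


rng = random.Random(0)
d_compound, d_species, rank = 64, 32, 4  # illustrative sizes only

U = [[rng.gauss(0, 1) for _ in range(d_compound)] for _ in range(rank)]
V = [[rng.gauss(0, 1) for _ in range(d_species)] for _ in range(rank)]
w = [rng.gauss(0, 1) for _ in range(rank)]
compound = [rng.gauss(0, 1) for _ in range(d_compound)]
species = [rng.gauss(0, 1) for _ in range(d_species)]

factored = factored_bilinear(compound, species, U, V, w)
full = full_bilinear(compound, species, U, V, w)
print(math.isclose(factored, full, rel_tol=1e-9, abs_tol=1e-9))  # True

# Parameter counts: the factored form versus an unconstrained W.
factored_params = rank * (d_compound + d_species + 1)
full_params = d_compound * d_species
print(factored_params, full_params)  # 388 2048
```

The factored form computes the same score as a full interaction matrix of rank at most 4 while storing far fewer weights, which is one plausible reason a model of this kind can stay small.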

Molecule Generation

KemGen

De novo molecule design with built-in selectivity optimization. KemGen generates candidates predicted to be simultaneously lethal to the target pest and safe for crops, bees, and humans.

Selectivity: Up to 182x pest-vs-crop LC50 ratio (Corn vs. Flour Beetle)
Dimension 1: Pest LC50, predicted lethal concentration
Dimension 2: Crop LC50, predicted crop safety threshold
Dimension 3: Bee LC50, pollinator safety score
Dimension 4: Human dietary LC50, consumer safety estimate
Dim. 5–7: QED synthesizability · Tox21 safety flags · Composite selectivity index
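
The selectivity ratio can be read as non-target LC50 divided by pest LC50, since a higher LC50 means lower toxicity. The sketch below works through that arithmetic. The LC50 values are illustrative numbers chosen so that the crop ratio comes out at 182x; they are not Kemfor predictions. The worst-case (minimum) composite shown here is only one possible definition, assumed for the example, because the page does not specify how the composite index is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedLC50:
    """Hypothetical container for predicted LC50 values (same units, e.g. mg/L)."""
    pest: float
    crop: float
    bee: float
    human: float


def selectivity_ratio(non_target_lc50: float, pest_lc50: float) -> float:
    """Ratio > 1 means the compound is more toxic to the pest than to the non-target."""
    if pest_lc50 <= 0 or non_target_lc50 <= 0:
        raise ValueError("LC50 values must be positive")
    return non_target_lc50 / pest_lc50


def composite_selectivity(pred: PredictedLC50) -> float:
    """Assumed composite: the worst (smallest) ratio across non-target organisms."""
    return min(
        selectivity_ratio(lc50, pred.pest)
        for lc50 in (pred.crop, pred.bee, pred.human)
    )


# Illustrative values only, not measured or predicted data.
candidate = PredictedLC50(pest=0.5, crop=91.0, bee=12.0, human=40.0)

crop_ratio = selectivity_ratio(candidate.crop, candidate.pest)
print(crop_ratio)                        # 182.0
print(composite_selectivity(candidate))  # 24.0
```

Taking the minimum means a candidate is only as selective as its least safe non-target comparison; in this example the bee ratio (24x), not the crop ratio, sets the composite.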

Crop Intelligence

Orchard

Seasonal threat intelligence at the field level. Maps pest pressure windows to crop growth stages, links active ingredients to ecotoxicological profiles, and surfaces residue risk by geography.

Threat Maps: Pest pressure by crop × region × phenological stage
AI Resolution: Active ingredient identification from common product names
Profiles: Full ecotox data for each resolved ingredient, pulled from the Atlas
Residue: Linked to regulatory MRL data for key markets

Compound Search

Molecule Explorer

28,355 compounds searchable by name, CAS number, SMILES, or InChIKey (16,836 fully harmonized with structure identifiers). Every page shows a live 2D structure render, measured and predicted toxicity data, and linked ToxCast assay results.

Search: Name, CAS, SMILES, InChIKey, or substructure
Structure: Live 2D rendering from SMILES (RDKit/Indigo in-browser)
ToxCast tab: HTS bioactivity across 1,800+ distinct assay endpoints
ECOTOX tab: Measured LC50/EC50 values across species and endpoints
Predictions: KemBoost predictions for unmeasured species
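
A search box that accepts several identifier types has to decide what kind of query it received. The sketch below shows one way to do that, using the public CAS check-digit rule and the standard InChIKey layout (14, 10, and 1 uppercase letters). The function names and the fall-through to a generic "text" class are assumptions for this example; telling a SMILES string from a compound name reliably needs a cheminformatics parser, which is out of scope here.

```python
import re

CAS_RE = re.compile(r"\d{2,7}-\d{2}-\d")
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_valid_cas(text: str) -> bool:
    """Check the CAS Registry Number format and its check digit.

    The check digit is the weighted sum of the other digits, read from
    right to left with weights 1, 2, 3, ..., taken modulo 10.
    """
    if not CAS_RE.fullmatch(text):
        return False
    digits = text.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check


def classify_query(query: str) -> str:
    """Return 'cas', 'inchikey', or 'text' (name or SMILES, resolved later)."""
    query = query.strip()
    if is_valid_cas(query):
        return "cas"
    if INCHIKEY_RE.fullmatch(query):
        return "inchikey"
    return "text"


for q in ["1912-24-9", "1912-24-8", "MXWJVTOOROXGIU-UHFFFAOYSA-N", "atrazine"]:
    print(q, "->", classify_query(q))
# 1912-24-9 -> cas
# 1912-24-8 -> text
# MXWJVTOOROXGIU-UHFFFAOYSA-N -> inchikey
# atrazine -> text
```

Validating the check digit up front lets a mistyped CAS number fall through to text search instead of returning an empty exact-match result.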

Species Search

Organism Explorer

13,561 species across all kingdoms. Each page shows transcriptome availability, pesticide target gene profile, and all measured toxicity data in the Atlas.

Kingdoms: Filter by Animalia, Plantae, Fungi, Bacteria, Protista
Transcriptome: Badge for species with Ensembl genome annotation
Target genes: Pesticide target gene expression levels per species
Toxicity data: All measured LC50/EC50/NOEC values in the Atlas
Predictions: Cross-species predictions from KemBoost