A New CSRML Structure-Based Fingerprint Method for Profiling and Categorizing Per- and Polyfluoroalkyl Substances (PFAS)

Chem Res Toxicol. 2023 Mar 20;36(3):508-534. doi: 10.1021/acs.chemrestox.2c00403. Epub 2023 Mar 2.

Abstract

The term PFAS encompasses diverse per- and polyfluorinated alkyl (and increasingly aromatic) chemicals spanning industrial processes, commercial uses, environmental occurrence, and potential concerns. With increased chemical curation, currently exceeding 14,000 structures in the PFASSTRUCTV5 inventory on EPA's CompTox Chemicals Dashboard, has come increased motivation to profile, categorize, and analyze the PFAS structure space using modern cheminformatics approaches. Making use of the publicly available ToxPrint chemotypes and ChemoTyper application, we have developed a new PFAS-specific fingerprint set consisting of 129 TxP_PFAS chemotypes coded in CSRML, a chemical-based XML-query language. These are split into two groups, the first containing 56 mostly bond-type ToxPrints modified to incorporate attachment to either a CF group or F atom to enforce proximity to the fluorinated portion of the chemical. This focus resulted in a dramatic reduction in TxP_PFAS chemotype counts relative to the corresponding ToxPrint counts (averaging 54%). The remaining TxP_PFAS chemotypes consist of various lengths and types of fluorinated chains, rings, and bonding patterns covering indications of branching, alternate halogenation, and fluorotelomers. Both groups of chemotypes are well represented across the PFASSTRUCT inventory. Using the ChemoTyper application, we show how the TxP_PFAS chemotypes can be visualized, filtered, and used to profile the PFASSTRUCT inventory, as well as to construct chemically intuitive, structure-based PFAS categories. Lastly, we used a selection of expert-based PFAS categories from the OECD Global PFAS list to evaluate a small set of analogous structure-based TxP_PFAS categories. TxP_PFAS chemotypes were able to recapitulate the expert-based PFAS category concepts based on clearly defined structure rules that can be computationally implemented and reproducibly applied to process large PFAS inventories without need to consult an expert. The TxP_PFAS chemotypes have the potential to support computational modeling, harmonize PFAS structure-based categories, facilitate communication, and allow for more efficient and chemically informed exploration of PFAS chemicals moving forward.
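
The abstract describes building structure-based categories from clearly defined rules over chemotype matches. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes each chemical has already been reduced to the set of chemotypes that matched it (for example, from an exported fingerprint table). The chemotype labels, category names, chemical identifiers, and helper functions are all hypothetical and are not taken from the TxP_PFAS CSRML file or the OECD list.

```python
from typing import Callable, Dict, List, Set

# A fingerprint is modeled as the set of chemotype labels that matched a chemical.
# All labels below are hypothetical placeholders, not real TxP_PFAS names.
Fingerprint = Set[str]

# Toy inventory (hypothetical chemicals and matches).
INVENTORY: Dict[str, Fingerprint] = {
    "chem_A": {"pf_chain_C8", "carboxylic_acid_near_CF"},
    "chem_B": {"pf_chain_C6", "sulfonic_acid_near_CF"},
    "chem_C": {"fluorotelomer_n2", "alcohol_near_CF"},
    "chem_D": {"pf_chain_C4"},
}


def has_prefix(fp: Fingerprint, prefix: str) -> bool:
    """Return True if any matched chemotype label starts with the given prefix."""
    return any(name.startswith(prefix) for name in fp)


# Each category is a boolean rule over chemotype presence (hypothetical rules).
CATEGORY_RULES: Dict[str, Callable[[Fingerprint], bool]] = {
    "toy_perfluoro_carboxylic_acid": lambda fp: (
        has_prefix(fp, "pf_chain_") and "carboxylic_acid_near_CF" in fp
    ),
    "toy_perfluoro_sulfonic_acid": lambda fp: (
        has_prefix(fp, "pf_chain_") and "sulfonic_acid_near_CF" in fp
    ),
    "toy_fluorotelomer": lambda fp: has_prefix(fp, "fluorotelomer_"),
}


def categorize(fp: Fingerprint) -> List[str]:
    """Return every category whose rule the fingerprint satisfies, in rule order."""
    return [name for name, rule in CATEGORY_RULES.items() if rule(fp)]


def chemotype_counts(inventory: Dict[str, Fingerprint]) -> Dict[str, int]:
    """Count how many chemicals in the inventory match each chemotype."""
    counts: Dict[str, int] = {}
    for fp in inventory.values():
        for name in fp:
            counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    for chem_id, fp in INVENTORY.items():
        print(chem_id, categorize(fp) or ["uncategorized"])
```

Because each rule is an explicit predicate on fingerprint contents, the same assignment can be rerun on any inventory without expert review, which is the reproducibility property the abstract emphasizes. A chemical may satisfy several rules or none; how overlaps are resolved is a design choice this sketch leaves open.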

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Cheminformatics*
  • Computer Simulation
  • Fluorocarbons* / chemistry

Substances

  • Fluorocarbons