Extracts the chemicals, CAS numbers, and functional uses from the ACI ingredient search. Only the ingredient inventory section was extracted (no supporting compounds). The web page was saved as a PDF and then converted to a text file using the pdftotext application, version 4.02. The conversion split many of the chemical names and functional uses across multiple lines. This is handled in two ways: first, chemicals whose names start with numbers are processed separately from those whose names start with letters; second, the CAS numbers of the first and last chemicals starting with each letter are noted, since the list is alphabetical. Together these allowed the names to be pieced back together. The process worked reasonably well, but there were exceptions, which were handled individually.
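
The rejoining idea described above can be sketched roughly as follows. This is a minimal illustration, not the script in this folder: the function names `section_lines` and `rejoin_section`, the sample lines, and the assumed layout (each name followed by its CAS number, and the first record of a section starting on the line that carries its CAS number) are all hypothetical. It assumes that within one letter's section every new name starts with that letter, so any other line is treated as a continuation of the previous name.

```python
import re

# CAS Registry Numbers have the form: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"\b\d{2,7}-\d{2}-\d\b")


def section_lines(lines, first_cas, last_cas):
    """Return the lines from the one holding first_cas through the one holding last_cas.

    Assumption: the alphabetical list lets each letter's section be bounded
    by the CAS numbers of its first and last chemicals.
    """
    start = next(i for i, line in enumerate(lines) if first_cas in CAS_PATTERN.findall(line))
    end = next(i for i, line in enumerate(lines) if last_cas in CAS_PATTERN.findall(line))
    return lines[start:end + 1]


def rejoin_section(lines, letter):
    """Rejoin wrapped names in one letter's section into (name, cas) tuples.

    Assumption: a line whose text starts with `letter` begins a new chemical;
    any other line continues the previous name.
    """
    records = []  # each entry is [name_text, cas_or_None]
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        match = CAS_PATTERN.search(text)
        cas = match.group(0) if match else None
        # Remove the CAS number so only the name fragment remains.
        fragment = CAS_PATTERN.sub("", text).strip()
        starts_new = fragment[:1].upper() == letter.upper()
        if starts_new or not records:
            records.append([fragment, cas])
        else:
            records[-1][0] = f"{records[-1][0]} {fragment}".strip()
            if cas and records[-1][1] is None:
                records[-1][1] = cas
    return [(name, cas) for name, cas in records]


# Illustrative sample lines, not taken from the actual converted file.
sample = [
    "Acetic acid 64-19-7",
    "Alcohols, C12-14,",
    "ethoxylated 68439-50-9",
    "Benzyl alcohol 100-51-6",
]

a_section = section_lines(sample, "64-19-7", "68439-50-9")
print(rejoin_section(a_section, "A"))
```

A continuation line that happens to start with the section's letter would be split incorrectly under this rule, which is the kind of exception that needs handling by hand. Names that start with numbers do not fit the letter rule at all and would need a separate pass.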