lkoval
5-2-19

Extracts the chemical names and CAS numbers from Annex 1 of European Commission Regulation 1451. The extraction used tabula-py version 1.1.1 in Python 3.7.2. Pages were extracted out of order because tabula read them inconsistently: some pages came back with an additional column (4 in total rather than the expected 3), which shifted the column in which some values appeared. All pages with the expected 3 columns were extracted together, followed by the pages that had 4 columns. Pages 32 and 36 were extracted individually because of additional formatting errors. The approximate page order was 13, 15, 16, 18, 21, 25, 26, 28, 30, 31, 33, 34, 11, 12, 14, 17, 19, 20, 22, 23, 24, 27, 29, 35, 32, 36.
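
The batching described above could be sketched roughly as follows. This is an illustration, not the original script. It assumes that the first twelve pages in the listed order are the 3-column pages and the next twelve are the 4-column pages, which the description implies but does not state outright. The names `extraction_batches`, `extraction_order`, `extract_batch`, and `pdf_path` are hypothetical. The only tabula-py call used is `tabula.read_pdf` with its `pages` argument.

```python
# Page groups inferred from the order given in the description (assumption:
# the first run is the 3-column pages, the second run the 4-column pages).
THREE_COLUMN_PAGES = [13, 15, 16, 18, 21, 25, 26, 28, 30, 31, 33, 34]
FOUR_COLUMN_PAGES = [11, 12, 14, 17, 19, 20, 22, 23, 24, 27, 29, 35]
INDIVIDUAL_PAGES = [32, 36]  # extracted one at a time due to formatting errors


def extraction_batches():
    """Return the page batches in the order they would be extracted."""
    batches = [list(THREE_COLUMN_PAGES), list(FOUR_COLUMN_PAGES)]
    batches += [[page] for page in INDIVIDUAL_PAGES]
    return batches


def extraction_order():
    """Flatten the batches into a single page order."""
    return [page for batch in extraction_batches() for page in batch]


def extract_batch(pdf_path, pages):
    """Read one batch of pages with tabula-py.

    `pdf_path` is a hypothetical path to the regulation PDF. tabula is
    imported here so the page bookkeeping above runs without Java installed.
    The shape of the return value depends on the tabula-py version.
    """
    import tabula

    return tabula.read_pdf(pdf_path, pages=pages)
```

Keeping the 3-column and 4-column pages in separate batches means each batch can be mapped to name and CAS number columns with a single rule, instead of checking the layout page by page.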