get_nexis_html function

Extracts texts and metadata from Nexis HTML files.