Web cleaner

Select linguistically relevant parts of HTML pages and convert them into plain text.

Author

Gwénolé Lecorvé, IRISA

Usage

    perl html2txt.pl <html_file>