I have an HTML document that contains numerous one-cell tables, each containing the name of a file. Is it possible to open an HTML document from VB.NET and iterate over the tables, and the elements within the tables, in order to get to the text within the cells? I'm trying to use the MSHTML DLL, and it seems likely that it has the capability, but I haven't figured out how to do it yet. The mshtml.HTMLDocument doesn't provide a Tables collection.
Even though there is no Tables collection per se, you can create a functional one with getElementsByTagName("table"). What follows is not VB; it is written in a scripting language called AutoIt. You should be able to adapt it readily, however.

Dale

    $ObjIE = ObjCreate("InternetExplorer.Application")
    With $ObjIE
        .Visible = True
        .Navigate("http://www.autoitscript.com/")
        ; Wait until the page has finished loading (ReadyState 4 = complete)
        While .ReadyState <> 4
            Sleep(50)
        WEnd
    EndWith

    $document = $ObjIE.document
    $body = $document.getElementsByTagName("body").item(0)
    $tables = $body.getElementsByTagName("table")

    For $table In $tables
        $rows = $table.getElementsByTagName("tr")
        For $row In $rows
            $tds = $row.getElementsByTagName("td")
            For $td In $tds
                ; Display the text contents of each cell...
                ConsoleWrite($td.innerText & @CRLF)
            Next
        Next
    Next

    Exit

"Phil Galey" wrote:

> I have an HTML document that contains numerous one-cell tables, each
> containing the name of a file.
>
> Is it possible to open an HTML document from VB.NET and iterate the tables
> and elements within the tables in order to get to the text within the cells?
> I'm trying to use the MSHTML DLL and it seems likely that it has the
> capability, but I haven't figured out how to do it yet. The
> mshtml.HTMLDocument doesn't provide a Tables collection.
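
For what it's worth, here is a rough sketch of how the AutoIt loop above might be adapted to VB.NET. It is untested and rests on several assumptions: it uses late binding (hence Option Strict Off) rather than a reference to the MSHTML interop assembly, it assumes Internet Explorer automation is still available on the machine, and the file path is a made-up placeholder. Because each table in the question holds a single cell, it skips the row loop and asks each table for its "td" elements directly.

```vbnet
Option Strict Off   ' late-bound calls on the COM objects need this

Imports System
Imports System.Threading
Imports Microsoft.VisualBasic

Module ListTableCells
    Sub Main()
        ' Hypothetical path; point this at your own HTML document.
        Dim url As String = "file:///C:/temp/files.html"

        ' Same COM object the AutoIt script creates with ObjCreate.
        Dim ie As Object = CreateObject("InternetExplorer.Application")
        Try
            ie.Visible = False
            ie.Navigate(url)

            ' Wait until the document has loaded (ReadyState 4 = complete).
            While CBool(ie.Busy) OrElse CInt(ie.ReadyState) <> 4
                Thread.Sleep(50)
            End While

            ' Stand-in for the missing Tables collection.
            Dim tables As Object = ie.Document.getElementsByTagName("table")
            For t As Integer = 0 To CInt(tables.length) - 1
                Dim cells As Object = tables.item(t).getElementsByTagName("td")
                For c As Integer = 0 To CInt(cells.length) - 1
                    ' innerText is the cell's text without any markup.
                    Console.WriteLine(cells.item(c).innerText)
                Next
            Next
        Finally
            ' Close the hidden browser instance even if something fails.
            ie.Quit()
        End Try
    End Sub
End Module
```

If you would rather stay with early binding, the same getElementsByTagName calls should be available through the MSHTML interop types once the document object is cast to them, but I have not tried that route here.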