Ureader.com  
Microsoft software help and Community
date: Thu, 10 Apr 2008 21:39:16 -0700,    group: microsoft.public.word.vba.beginners


Stepping through a document   
I have dozens of Word documents that are composed mostly of tables, and a 
line that labels the table.
There may be many tables in one document. Some documents have hundreds of 
rows in many tables.

e.g:

 Table name or descriptor
 column one data        column2 data
 next Col1 data         corresponding col2 data


 Next table name
 col1 Data              Column two data
 Another Column1        Its column 2 data

Generally, there is one line of description for each table, followed by the 
table.

These Word documents were created over a space of several years, and were 
manually maintained, so that "generally" is key.  I'm sure that sprinkled 
throughout these files are notations, keyed-entries, explanatory texts, etc.

The spacing between tables and headers is not standardized.
The style is not standardized.

What I need is to export all these docs to either Excel or Access, and to 
create a three-column table, with the table header as the first or third 
column - not important.

I can't wrap my head around an algorithm for this.  I can't do a For Each... 
Next loop, I don't think.
Do I step through each paragraph, and assess if it is a row of a table?  If 
so, how do I do that?
How do I handle a situation if there is one row or one table that has three 
cells in it?

If I am stepping through paragraphs, and determine that it is a row in a 
table, how do I then step through the cells in that row?

I don't know how to approach this, and any help will be GREATLY appreciated!
date: Thu, 10 Apr 2008 21:39:16 -0700   author:   Margaret Bartley

Re: Stepping through a document   
Hi Margaret,

Just to start you thinking, one way I might approach this is to go
through the Tables collection in Word,

e.g. (the pseudocode below)
Dim myTable As Table
For Each myTable In ActiveDocument.Tables
   ' Do something like get the first and last row of the table
   ' and list it in a report so that you can check that it is
   ' picking up the info correctly.
   ' Use myTable.Columns.Count to output the number of columns in the
   ' table.
Next myTable

Having a handle on each table, you could use its Range property to look
at the paragraph before and/or after the table, again to check and
list that you have the correct table.
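
As a rough sketch of that idea (not tested against the original poster's files), the routine below walks the Tables collection and, for each table, looks backwards for the nearest non-empty paragraph and treats it as the label. It assumes it is run inside Word against the active document. The names CleanText, TableLabel and ListTablesWithLabels are made up for this illustration.

```vba
Option Explicit

' Strips paragraph marks (Chr(13)) and end-of-cell markers (Chr(7))
' from a Range.Text value and trims surrounding spaces.
Function CleanText(ByVal s As String) As String
    s = Replace(s, Chr(13), "")
    s = Replace(s, Chr(7), "")
    CleanText = Trim$(s)
End Function

' Returns the nearest non-empty paragraph above tbl, or "" if the search
' reaches the start of the document or runs into the previous table.
Function TableLabel(ByVal tbl As Table) As String
    Dim rng As Range
    Dim lastStart As Long

    lastStart = tbl.Range.Start
    Set rng = tbl.Range.Previous(wdParagraph, 1)
    Do While Not rng Is Nothing
        ' Guard: stop if we are no longer moving towards the document start.
        If rng.Start >= lastStart Then Exit Do
        ' Stop at the preceding table rather than reading its cell text.
        If rng.Information(wdWithInTable) Then Exit Do
        If Len(CleanText(rng.Text)) > 0 Then
            TableLabel = CleanText(rng.Text)
            Exit Function
        End If
        lastStart = rng.Start
        Set rng = rng.Previous(wdParagraph, 1)
    Loop
End Function

' Lists each table with its guessed label so the pairing can be checked by eye.
Sub ListTablesWithLabels()
    Dim tbl As Table
    Dim n As Long

    For Each tbl In ActiveDocument.Tables
        n = n + 1
        Debug.Print n & vbTab & TableLabel(tbl) & vbTab & _
                    tbl.Range.Cells.Count & " cells"
    Next tbl
End Sub
```

Running ListTablesWithLabels and reading the Immediate window gives a quick report of which tables have no label, or a label that looks wrong, before any data is exported.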

Hope this helps to get you thinking.

Cheers!
TonyS.


date: Sat, 19 Apr 2008 17:44:00 -0700 (PDT)   author:   Tony Strazzeri

Re: Stepping through a document   
Margaret,

If you create the following folders

C:\WordData
C:\WordData\Processed

and you move all of the documents from which you want to extract the data 
into the folder C:\WordData

Then, if you create a database named WordData in the C:\ folder and in it you 
create a table named tblWordData containing the following fields

Descriptor
Column1
Column2

and you create a Form in the database that contains a command button and if 
you have the following code in that command button Click event, it will 
import all of the data from all of the tables in all of the documents in the 
C:\WordData folder into the table tblWordData

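' Requires references (Tools > References) to the Microsoft DAO and
' Microsoft Word object libraries, because the code uses the constants
' wdParagraph and wdDoNotSaveChanges and early-bound Range variables.
Dim dbsWordData As DAO.Database
Dim rstWordData As DAO.Recordset
Dim wordApp As Object
Dim vDescriptor As String
Dim vColumn1 As String
Dim vColumn2 As String
Dim FldrPath As String
Dim RecordDoc As String
Dim Source As Object
Dim i As Long, j As Long, k As Long
Dim tblrng As Word.Range
Dim tblname As Word.Range
Dim datarng As Word.Range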

Set dbsWordData = OpenDatabase("c:\WordData.mdb")
Set rstWordData = dbsWordData.OpenRecordset("tblWordData", dbOpenDynaset)

On Error GoTo CreateWordApp
Set wordApp = GetObject(, "Word.Application")
wordApp.Visible = False
On Error Resume Next

FldrPath = "C:\WordData\"

RecordDoc = Dir$(FldrPath & "*.doc")
k = 0
While RecordDoc <> ""
    Set Source = wordApp.Documents.Open(FldrPath & RecordDoc)
    With Source
        For i = 1 To .Tables.Count
            Set tblrng = .Tables(i).Range
            Set tblname = tblrng.Duplicate
            tblname.MoveStart wdParagraph, -1
            tblname.End = tblname.Paragraphs(1).Range.End - 1
            vDescriptor = tblname.Text
            Set datarng = .Tables(i).Cell(1, 1).Range
            datarng.End = datarng.End - 1
            vColumn1 = datarng.Text
            Set datarng = .Tables(i).Cell(1, 2).Range
            datarng.End = datarng.End - 1
            vColumn2 = datarng.Text
            With rstWordData
                .AddNew
                !Descriptor = vDescriptor
                !Column1 = vColumn1
                !Column2 = vColumn2
                .Update
                For j = 2 To Source.Tables(i).Rows.Count
                    Set datarng = Source.Tables(i).Cell(j, 1).Range
                    datarng.End = datarng.End - 1
                    vColumn1 = datarng.Text
                    Set datarng = Source.Tables(i).Cell(j, 2).Range
                    datarng.End = datarng.End - 1
                    vColumn2 = datarng.Text
                    .AddNew
                    !Column1 = vColumn1
                    !Column2 = vColumn2
                    .Update
                Next j
            End With
        Next i
    End With
    k = k + 1
    ' Save a copy into the Processed folder, close it, then delete the original.
    Source.SaveAs "c:\WordData\Processed\" & Source.Name
    Source.Close wdDoNotSaveChanges
    Kill FldrPath & RecordDoc
    RecordDoc = Dir
Wend
MsgBox "Data Imported from " & k & " documents."
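wordApp.Quit

   rstWordData.Close
   dbsWordData.Close
   Set wordApp = Nothing
   Set Source = Nothing
   Set rstWordData = Nothing
   Set dbsWordData = Nothing
   ' Leave the Click event here so execution does not fall into the handler.
   Exit Sub

CreateWordApp:
   ' No running instance of Word was found, so start one.
   Set wordApp = CreateObject("Word.Application")
   Resume Next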

After each document is processed, it is moved into the folder 
C:\WordData\Processed and when all of the documents have been processed, a 
message box will be displayed with the message "Data imported from # 
documents."
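
One possible variation on the move step, offered as a sketch only: collect the file names first, then move each closed document with the Name statement, so that files are not deleted while Dir is still enumerating the folder. The helper names ListDocs and MoveToProcessed are invented for this example, and the folder paths are the ones given above.

```vba
Option Explicit

' Returns the names of all files matching *.doc in the given folder.
' The folder path is expected to end with a backslash.
Function ListDocs(ByVal folder As String) As Collection
    Dim names As New Collection
    Dim f As String

    f = Dir$(folder & "*.doc")
    Do While f <> ""
        names.Add f
        f = Dir$()
    Loop
    Set ListDocs = names
End Function

' Moves one file from C:\WordData into C:\WordData\Processed.
' The document must already be closed, and Name raises an error
' if a file with the same name already exists in the target folder.
Sub MoveToProcessed(ByVal fileName As String)
    Name "C:\WordData\" & fileName As "C:\WordData\Processed\" & fileName
End Sub

' Example driver: the import itself is left out of this sketch.
Sub MoveAllDocs()
    Dim item As Variant

    For Each item In ListDocs("C:\WordData\")
        ' ... open the document, import its tables, close it ...
        MoveToProcessed CStr(item)
    Next item
End Sub
```

Because the list is built before anything is moved, the loop does not depend on how Dir behaves when the folder contents change underneath it.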

Testing the above with two documents, each containing two tables with 
Document#Table# in a paragraph before each table, it imported the data into 
the tblWordData as follows:

Descriptor          Column1     Column2
Document1Table 1    D1T1R1C1    D1T1R1C2
                    D1T1R2C1    D1T1R2C2
                    D1T1R3C1    D1T1R3C2
Document1Table 2    D1T2R1C1    D1T2R1C2
                    D1T2R2C1    D1T2R2C2
                    D1T2R3C1    D1T2R3C2
Document2Table 1    D2T1R1C1    D2T1R1C2
                    D2T1R2C1    D2T1R2C2
                    D2T1R3C1    D2T1R3C2
Document2Table 2    D2T2R1C1    D2T2R1C2
                    D2T2R2C1    D2T2R2C2
                    D2T2R3C1    D2T2R3C2

Whether or not it will work for you will depend upon the meaning of your 
word "generally".
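
To find out where the documents depart from the expected layout before importing, something like the following might help. It is a sketch meant to be run inside Word on one document at a time. The first routine flags tables whose rows do not all have the same number of cells and lists every cell with its row and column index, which also covers rows with three cells. The second lists text that sits outside any table, such as labels and stray notes. The routine names and StripCellMarker are made up for this example.

```vba
Option Explicit

' Removes the end-of-cell marker (Chr(7)) and any trailing paragraph
' marks (Chr(13)) from the text of a cell.
Function StripCellMarker(ByVal s As String) As String
    s = Replace(s, Chr(7), "")
    Do While Right$(s, 1) = Chr(13)
        s = Left$(s, Len(s) - 1)
    Loop
    StripCellMarker = s
End Function

' Prints table number, row, column and text for every cell.
' Iterating tbl.Range.Cells avoids indexing rows directly, which can
' fail on tables that contain vertically merged cells.
Sub DumpAllCells()
    Dim tbl As Table
    Dim c As Cell
    Dim t As Long

    For Each tbl In ActiveDocument.Tables
        t = t + 1
        ' Uniform is False when the rows do not all have the same number of columns.
        If Not tbl.Uniform Then
            Debug.Print "Table " & t & " is not uniform; check it by hand."
        End If
        For Each c In tbl.Range.Cells
            Debug.Print t & vbTab & c.RowIndex & vbTab & c.ColumnIndex & _
                        vbTab & StripCellMarker(c.Range.Text)
        Next c
    Next tbl
End Sub

' Prints every non-empty paragraph that is not inside a table.
Sub ListTextOutsideTables()
    Dim p As Paragraph

    For Each p In ActiveDocument.Paragraphs
        If Not p.Range.Information(wdWithInTable) Then
            If Len(StripCellMarker(p.Range.Text)) > 0 Then
                Debug.Print StripCellMarker(p.Range.Text)
            End If
        End If
    Next p
End Sub
```

Anything printed by ListTextOutsideTables that is not a table label is a candidate for the notations and explanatory text mentioned in the original question.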

-- 
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP

date: Sun, 20 Apr 2008 17:46:47 +1000   author:   Doug Robbins - Word MVP

Re: Stepping through a document   
Thank you! That looks like exactly what I want.
date: Tue, 22 Apr 2008 05:03:25 -0700   author:   Margaret Bartley
