Ureader.com  
Microsoft software help and Community
date: Wed, 11 Jun 2008 22:44:48 -0400,    group: microsoft.public.xml


LINQ to XML   
How would I use LINQ to XML (vb .net) to extract just the text (not the 
markup) of the following:

<description><![CDATA[<p><a 
href="http://www.msnbc.msn.com/id/25085248/"><img align="left" border="0" 
src="http://msnbcmedia.msn.com/j/msnbc/Components/ArtAndPhoto-Fronts/TECH/080610/g-080610-tec-tencent-4x3-10a.thumb.jpg" 
alt="" style="margin:0 5px 5px 0" /></a>In an interview, David Hajdu, author 
of the book "The Ten-Cent Plague: The Great Comic-Book Scare and How it 
Changed America," discusses the rise and fall of  the rise and fall of 
comics, and how their unwelcome reception compares to today's criticism of 
violent video games.</p><br clear="all" />]]></description>

In other words, I'm only interested in:

In an interview, David Hajdu, author of the book "The Ten-Cent Plague: The 
Great Comic-Book Scare and How it Changed America," discusses the rise and 
fall of  the rise and fall of comics, and how their unwelcome reception 
compares to today's criticism of violent video games.

Thanks.
date: Wed, 11 Jun 2008 22:44:48 -0400   author:   Scott M.

Re: LINQ to XML   
Given that the data is in a CDATA section (not a good way to store mark-up), 
it is treated as text. To turn it into XML I think you'd have to 
extract the whole of it, reload it into an XElement/XDocument, and then 
extract the text.


-- 

Joe Fawcett (MVP - XML)

http://joe.fawcett.name
date: Thu, 12 Jun 2008 08:34:36 +0100   author:   Joe Fawcett

Re: LINQ to XML   
Scott M. wrote:
> How would I use LINQ to XML (vb .net) to extract just the text (not the 
> markup) of the following:
> 
> <description><![CDATA[<p><a 
> href="http://www.msnbc.msn.com/id/25085248/"><img align="left" border="0" 
> src="http://msnbcmedia.msn.com/j/msnbc/Components/ArtAndPhoto-Fronts/TECH/080610/g-080610-tec-tencent-4x3-10a.thumb.jpg" 
> alt="" style="margin:0 5px 5px 0" /></a>In an interview, David Hajdu, author 
> of the book "The Ten-Cent Plague: The Great Comic-Book Scare and How it 
> Changed America," discusses the rise and fall of  the rise and fall of 
> comics, and how their unwelcome reception compares to today's criticism of 
> violent video games.</p><br clear="all" />]]></description>

Here is how you can do it, using Joe's suggestion:

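             // desc.Value is the CDATA content as plain text, tags included.
             XElement desc = XElement.Load(@"file.xml");
             // Wrap that text in a root element and parse it as XML.
             XElement data = XElement.Parse("<data>" + desc.Value + "</data>");
             // Value concatenates the text nodes, so the markup is dropped.
             Console.WriteLine(data.Value);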




-- 

	Martin Honnen --- MVP XML
	http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 13:19:46 +0200   author:   Martin Honnen

Re: LINQ to XML   
> Here is how you can do it, using Joe's suggestion:
>
>             XElement desc = XElement.Load(@"file.xml");
>             XElement data = XElement.Parse("<data>" + desc.Value + "</data>");
>             Console.WriteLine(data.Value);

I think this is close, but the XML is not in a file, so I can't use your 
.Load() suggestion to establish the desc variable.  How could I do this with 
XML that is being brought in from a dynamic source?
date: Thu, 12 Jun 2008 09:57:08 -0400   author:   Scott M.

Re: LINQ to XML   
> Here is how you can do it, using Joe's suggestion:
>
>             XElement desc = XElement.Load(@"file.xml");
>             XElement data = XElement.Parse("<data>" + desc.Value + 
> "</data>");
>             Console.WriteLine(data.Value);

After tweaking, I got this code running, but it doesn't strip out the 
markup, which is what I want.
date: Thu, 12 Jun 2008 10:00:48 -0400   author:   Scott M.

Re: LINQ to XML   
Scott M. wrote:
>> Here is how you can do it, using Joe's suggestion:
>>
>>             XElement desc = XElement.Load(@"file.xml");
>>             XElement data = XElement.Parse("<data>" + desc.Value + "</data>");
>>             Console.WriteLine(data.Value);
> 
> I think this is close, but the XML is not in a file, so I can't use your 
> .Load() suggestion to establish the desc variable.  How could I do this with 
> XML that is being brought in from a dynamic source? 

What is a "dynamic source"?

XElement.Load also has overloads taking a TextReader or an XmlReader:
http://msdn.microsoft.com/en-us/library/system.xml.linq.xelement.load.aspx


If you have a string with XML then use
   XElement desc = XElement.Parse(stringWithXml);
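
Putting the two steps together for XML that arrives as a string, a minimal C# sketch might look like the following. The class name CdataText, the method name ExtractText, the shortened sample text and the example.com URL are all made up for illustration. It assumes the markup inside the CDATA section is well-formed XML; if it is not (an unclosed <br>, an entity such as &nbsp;), the second XElement.Parse call throws an XmlException.

```csharp
using System;
using System.Xml.Linq;

public static class CdataText
{
    // Hypothetical helper: returns the text of an element whose content
    // is markup escaped in a CDATA section.
    // Assumption: the escaped markup is itself well-formed XML.
    public static string ExtractText(string elementXml)
    {
        // Parse the outer element from a string instead of a file.
        XElement desc = XElement.Parse(elementXml);

        // desc.Value is the CDATA content as plain text, tags included.
        // Wrapping it in a single root element makes it parseable as XML.
        XElement data = XElement.Parse("<data>" + desc.Value + "</data>");

        // Value concatenates all descendant text nodes and drops the tags.
        return data.Value;
    }

    public static void Main()
    {
        // Shortened, made-up sample in the shape of the feed item.
        string xml =
            "<description><![CDATA[<p><a href=\"http://example.com/\">" +
            "<img alt=\"\" /></a>In an interview, David Hajdu discusses comics.</p>" +
            "<br clear=\"all\" />]]></description>";

        Console.WriteLine(ExtractText(xml));
    }
}
```

With the sample above this prints only the sentence, because the a, img and br elements contribute no text of their own.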

-- 

	Martin Honnen --- MVP XML
	http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 16:06:02 +0200   author:   Martin Honnen

Re: LINQ to XML   
Since when is a CDATA section not a good way to store markup?



"Joe Fawcett" <joefawcett@newsgroup.nospam> wrote in message 
news:%23LehF7FzIHA.2220@TK2MSFTNGP06.phx.gbl...
> Given that the data is in a CDATA section (not a good way to store mark-up), 
> it is treated as text. To turn it into XML I think you'd have to 
> extract the whole of it, reload it into an XElement/XDocument, and then 
> extract the text.
date: Thu, 12 Jun 2008 10:04:08 -0400   author:   Scott M.

Re: LINQ to XML   
Scott M. wrote:
> Since when is a CDATA section not a good way to store markup?

Using XHTML markup instead of a CDATA section escaping HTML markup gives 
you more flexibility.
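
One way to see the difference is to store the same markup both ways and query it. The sketch below is only an illustration: the class name InlineMarkupDemo, the method CountLinks and the sample strings are made up, and the inline markup is assumed to carry no XHTML namespace (with a namespace, Descendants would need an XNamespace-qualified name).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class InlineMarkupDemo
{
    // Hypothetical helper: counts the <a> elements LINQ to XML can see
    // below the parsed element.
    public static int CountLinks(string elementXml)
    {
        return XElement.Parse(elementXml).Descendants("a").Count();
    }

    public static void Main()
    {
        // Made-up sample data: the same markup stored two ways.
        string inline =
            "<description><p><a href=\"http://example.com/\">story</a>" +
            " Some text.</p></description>";
        string escaped =
            "<description><![CDATA[<p><a href=\"http://example.com/\">story</a>" +
            " Some text.</p>]]></description>";

        // Inline markup is part of the tree, so it can be queried directly
        // and Value already returns just the text.
        Console.WriteLine(CountLinks(inline));
        Console.WriteLine(XElement.Parse(inline).Value);

        // Escaped markup is one text node: no <a> elements are visible
        // and Value returns the raw string, tags included.
        Console.WriteLine(CountLinks(escaped));
        Console.WriteLine(XElement.Parse(escaped).Value);
    }
}
```

The inline version reports one link and plain text, while the CDATA version reports no links and returns the markup as a string that has to be parsed a second time.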

-- 

	Martin Honnen --- MVP XML
	http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 16:12:14 +0200   author:   Martin Honnen

Re: LINQ to XML   
Scott M. wrote:
>> Here is how you can do it, using Joe's suggestion:
>>
>>             XElement desc = XElement.Load(@"file.xml");
>>             XElement data = XElement.Parse("<data>" + desc.Value + 
>> "</data>");
>>             Console.WriteLine(data.Value);
> 
> After tweaking, I got this code running, but it doesn't strip out the 
> markup, which is what I want. 

Can you show us your code? My sample works for me as you described, 
outputting the text without markup.

-- 

	Martin Honnen --- MVP XML
	http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 16:13:34 +0200   author:   Martin Honnen

Re: LINQ to XML   
I don't understand your reply.  You're saying that all the XHTML markup 
should be escaped?  For large blocks of markup, this is extremely cumbersome 
and goes against what CDATA was created to do.


"Martin Honnen"  wrote in message 
news:eOYDSZJzIHA.5816@TK2MSFTNGP02.phx.gbl...
> Scott M. wrote:
>> Since when is a CDATA section not a good way to store markup?
>
> Using XHTML markup instead of a CDATA section escaping HTML markup gives 
> you more flexibility.
>
> -- 
>
> Martin Honnen --- MVP XML
> http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 10:22:49 -0400   author:   Scott M.

Re: LINQ to XML   
First, I tried this:

            Dim desc = item.Element("description").value
            Dim data = XElement.Parse("<data>" + desc.Value + "</data>")
            Console.WriteLine(data.Value)

And that produced the contents of the CDATA section (including markup).

But, now I've got it working as desired using this:

            Dim cd As New XCData(item.Element("description").Value)
            'Now, use System.Xml and the XML DOM to get the text out of the CDATA section
            Dim data As New Xml.XmlDocument
            data.LoadXml("<data>" + cd.Value + "</data>")
            Console.WriteLine("     {0}", data.InnerText)

Thanks for your help!




"Martin Honnen"  wrote in message 
news:e8niBaJzIHA.5816@TK2MSFTNGP02.phx.gbl...
> Scott M. wrote:
>>> Here is how you can do it, using Joe's suggestion:
>>>
>>>             XElement desc = XElement.Load(@"file.xml");
>>>             XElement data = XElement.Parse("<data>" + desc.Value + 
>>> "</data>");
>>>             Console.WriteLine(data.Value);
>>
>> After tweaking, I got this code running, but it doesn't strip out the 
>> markup, which is what I want.
>
> Can you show us your code? My sample works for me as you described, 
> outputting the text without markup.
>
> -- 
>
> Martin Honnen --- MVP XML
> http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 10:26:57 -0400   author:   Scott M.

Re: LINQ to XML   
Scott M. wrote:
> I don't understand your reply.  You're saying that all the XHTML markup 
> should be escaped?  For large blocks of markup, this is extremely cumbersome 
> and goes against what CDATA was created to do.

If you use XHTML then you don't have a need at all to use CDATA sections.

-- 

	Martin Honnen --- MVP XML
	http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 16:29:30 +0200   author:   Martin Honnen

Re: LINQ to XML   
That's not true.  First, this XML is coming from an MSNBC rss feed, so I 
can't change what they are sending.  But, more importantly, CDATA sections 
are for *any* data that you want an XML parser to ignore the markup of.  For 
maximum flexibility, it makes sense to store this data in a CDATA section 
when you don't know what the parser will be.


"Martin Honnen"  wrote in message 
news:u1Ik7iJzIHA.5820@TK2MSFTNGP04.phx.gbl...
> Scott M. wrote:
>> I don't understand your reply.  You're saying that all the XHTML markup 
>> should be escaped?  For large blocks of markup, this is extremely 
>> cumbersome and goes against what CDATA was created to do.
>
> If you use XHTML then you don't have a need at all to use CDATA sections.
>
> -- 
>
> Martin Honnen --- MVP XML
> http://JavaScript.FAQTs.com/
date: Thu, 12 Jun 2008 10:48:07 -0400   author:   Scott M.

Re: LINQ to XML   
"Scott M." <smar@nospam.nospam> wrote in message 
news:OsuYwtJzIHA.2384@TK2MSFTNGP02.phx.gbl...
> That's not true.  First, this XML is coming from an MSNBC rss feed, so I 
> can't change what they are sending.  But, more importantly, CDATA sections 
> are for *any* data that you want an XML parser to ignore the markup of. 
> For maximum flexibility, it makes sense to store this data in a CDATA 
> section when you don't know what the parser will be.
>
>
Yes, but if you store it as CDATA then you have this sort of problem later 
when you do want it to be mark-up.
I think it's better to store it as XHTML, although obviously that's not always 
possible if you are getting it from a third-party source.


-- 

Joe Fawcett (MVP - XML)

http://joe.fawcett.name
date: Thu, 12 Jun 2008 16:31:55 +0100   author:   Joe Fawcett

Re: LINQ to XML   
Well, now that I've found the solution (and it wasn't that complex), I don't 
see how this creates any kind of lasting "problem".


"Joe Fawcett" <joefawcett@newsgroup.nospam> wrote in message 
news:u96ozFKzIHA.2384@TK2MSFTNGP02.phx.gbl...
> "Scott M." <smar@nospam.nospam> wrote in message 
> news:OsuYwtJzIHA.2384@TK2MSFTNGP02.phx.gbl...
>> That's not true.  First, this XML is coming from an MSNBC rss feed, so I 
>> can't change what they are sending.  But, more importantly, CDATA 
>> sections are for *any* data that you want an XML parser to ignore the 
>> markup of. For maximum flexibility, it makes sense to store this data in 
>> a CDATA section when you don't know what the parser will be.
>>
>>
> Yes, but if you store it as CDATA then you have this sort of problem later 
> when you do want it to be mark-up.
> I think it's better to store as XHTML, obviously if you are getting this 
> from a third-party source that's not always possible.
>
>
> -- 
>
> Joe Fawcett (MVP - XML)
>
> http://joe.fawcett.name
>
>
>
date: Thu, 12 Jun 2008 12:09:30 -0400   author:   Scott M.
