Using scraper to grab HTML

Is there an easy way to get the scraper to grab HTML and leave it as HTML, rather than HTML-encoding it all?

For the MT.org Action Stream I've just made, I want to grab the exact contents of the li (with all its links) and output it as is. However, if I use TEXT in the scraper I just get text, and if I use HTML the stream displays the markup as literal text:

Richard <a href="http://forums.movabletype.org/2008/08/as-docs----mayday-mayday.html#comment-6679"> Answered AS Docs -- Mayday! Mayday! </a> on MT.org

Or do I need to do a callback?
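
For anyone trying to picture the difference, here is a minimal sketch of the three outcomes described above: text only, raw inner HTML, and HTML that has been entity-encoded before display. It is written in Python with the standard library, not Perl, and it is not the Action Streams scraper or its callback API. The class `ListItemExtractor`, the function `extract_first_li`, and the sample URL are all hypothetical names made up for illustration.

```python
from html import escape
from html.parser import HTMLParser


class ListItemExtractor(HTMLParser):
    """Collects the first <li> both as plain text and as raw inner HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0        # greater than 0 while inside the first <li>
        self.done = False     # True once the first <li> has been closed
        self.text_parts = []  # what a TEXT-style extraction would keep
        self.html_parts = []  # what an HTML-style extraction would keep

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth > 0:
            # Keep the tag exactly as it appeared in the source.
            self.html_parts.append(self.get_starttag_text())
            if tag == "li":
                self.depth += 1
        elif tag == "li":
            self.depth = 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through once.
        if self.depth > 0:
            self.html_parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == "li":
            self.depth -= 1
            if self.depth == 0:
                self.done = True
                return
        self.html_parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.text_parts.append(data)
            # Character data was unescaped by the parser, so re-escape it
            # to keep the collected fragment valid HTML.
            self.html_parts.append(escape(data, quote=False))


def extract_first_li(page):
    """Return (text_only, inner_html) for the first <li> in the page."""
    parser = ListItemExtractor()
    parser.feed(page)
    parser.close()
    return "".join(parser.text_parts).strip(), "".join(parser.html_parts).strip()


# Hypothetical sample markup, shaped like the list item in the question.
SAMPLE = (
    '<ul><li>Richard <a href="http://example.com/post.html#comment-1">'
    "Answered a topic</a> on MT.org</li></ul>"
)

text_only, inner_html = extract_first_li(SAMPLE)
encoded = escape(inner_html)  # what a template shows if it encodes the HTML again
```

Here `text_only` loses the link, `inner_html` keeps it, and `encoded` is the kind of string that shows up as visible tags in the page. If the stream is doing that last step on output, the fix has to happen wherever that encoding is applied, which may be why a callback is needed.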

2 Replies

  • It'd probably be a lot easier with a callback. That's what I've had to do with Slashdot and Reddit because those sites have some of the most horrific HTML I've ever seen.

    • I was hoping you wouldn't say that!

      I'm seeing if the feeds can be made a little more useful, so that I won't have to write a callback (my Perl knowledge is pretty much non-existent) and can ditch the scraper.
