UBot Underground

$10 to the first person to finish this article bot



>> $10 Paypal to the first person who can finish the last part of this script!

------------------------------------------------------------------------

 

On one webpage, there are 3 different blocks of content, each having a TITLE and an ARTICLE. I've written a bot that sets the KEYWORD, TITLE 1, and ARTICLE 1, but for whatever reason I can't figure out how to properly set TITLE 2 & 3 and ARTICLE 2 & 3.

 

* I've attached a bot which has the first TITLE and ARTICLE successfully working

 

* The sample page that I'm scraping is:

http://www.viligent.com/ubot_sample.htm

 

* Here's a screenshot that shows the layout of the page (also attached):

http://viligent.com/images/ubot-sample.jpg

 


 

Let me know if you have any questions that I can clarify.

Otherwise, if you're able to finish this bot, please post it here or send it to me at semjuice @ gmail .com (along with your PayPal address so I can pay you :)

 

Thanks in advance!

  • 2 weeks later...

In my opinion, your code to scrape title2 should work and this is a UBot bug. Either that, or I don't understand the implementation of $page_scrape.

 

One time I was so frustrated with $page_scrape that I wrote my own function in JavaScript. Unfortunately I can't help you with that function, because your page doesn't have JavaScript on it, and pages without existing JavaScript don't run inserted JavaScript (btw, my utility SpeedyBot can get around this).

 

So I can't claim your $10. Still, I attached a bot that would work if you had JavaScript on the page.

 

I'd also like to see how a UBot expert would scrape this page with $page_scrape...

ArticleScrapeJavascript.ubot



 

I agree. I had a mess about with this, and that scrape just won't work even though it should.


OK I solved it for you :-)

 

Will email you the working bot.

 

Andy

 

 

Awesome Andy! Thanks! I got your email, tested it out and everything worked perfectly (just had to update the location of the file). Let me know if you didn't get the paypal...I sent it about 2 mins ago.



 

Excellent. I meant to change the file location to use the documents folder before I sent it to you. Thanks for the PayPal payment; I shall be donating it to a local school.

 

Best wishes,

 

Andy


Do you mind sharing how you did it without javascript?

 

Did you use the page_scrape function?

Yes I used page scrape, some loops and some lists...

 

Basically, I used the scrape to get the titles, then scraped the whole page, wrote it out to a temp txt file, and read it back into a list so that each line of the page was a separate list item. I had title2 and title3 stored in variables, so I looped through the new list and recorded the positions at which title2 and title3 fell. Finally, I built the article2 and article3 variables by looping through the list again: if the line number was greater than title2pos and less than title3pos, I added the line to the article2 variable; if it was greater than title3pos, I added it to article3.
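
For anyone who wants to see the logic spelled out, here is a rough sketch of the same idea in Python rather than UBot script. It is not the bot that was emailed; the sample page text, the function names `find_line` and `split_articles`, and the title strings are all made up for illustration. It also splits the text into lines in memory instead of going through a temp file, which should give the same list of lines.

```python
def find_line(lines, text):
    """Return the index of the first line that contains `text`."""
    for i, line in enumerate(lines):
        if text in line:
            return i
    raise ValueError(f"{text!r} not found on the page")


def split_articles(lines, title2, title3):
    """Collect the lines between title2 and title3, and those after title3."""
    title2_pos = find_line(lines, title2)
    title3_pos = find_line(lines, title3)

    article2_lines = []
    article3_lines = []
    for i, line in enumerate(lines):
        if title2_pos < i < title3_pos:
            # Strictly between the two titles: belongs to article 2.
            article2_lines.append(line)
        elif i > title3_pos:
            # Everything after the third title: belongs to article 3.
            article3_lines.append(line)
    return "\n".join(article2_lines), "\n".join(article3_lines)


# Hypothetical page text standing in for the scraped sample page.
page_text = """Keyword: widgets
First Title
First article line.
Second Title
Second article line one.
Second article line two.
Third Title
Third article line."""

# One list item per line of the page (the temp-file step, done in memory).
page_lines = page_text.splitlines()
article2, article3 = split_articles(page_lines, "Second Title", "Third Title")
```

Note that this assumes each title appears on its own line and that title2's text does not also show up earlier on the page; if it did, the first match would be picked and the articles would come out wrong.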

 

I don't know if that makes any sense at all?

 

Andy



 

Nice workaround. I never thought to save the page to a file and parse it line by line. I'll keep that in mind if I have to make a scrape-bot for a page without JavaScript.

 

Thanks for sharing.
