UBot Underground

Rss Feed Scraping



OK, I'm back. Sorry to keep bugging you guys, but I'm really having a hard time getting this.

I haven't really been able to do anything that I set out to do today, and I know it's not the software. I just don't get the concepts of variables and constants and how it all works together just yet. I've been watching the tutorials all day.

But that's not why I wrote this. I want to know how to scrape or gather RSS feeds. I saw one post that said to page scrape and get these <B></B>, but I can't seem to do that for some reason. I don't even see them when I right-click on the page.

Is there something I'm missing? And I don't want a bot to do it for me; I want to actually learn how.



For future reference: when you're trying to solve a problem, you'll get a significantly better answer if you describe in detail EXACTLY what you're trying to do and where you're trying to do it.

Tell me (on this thread or via PM if you'd like) what you're trying to scrape/gather and what URL you're trying to scrape/gather it from, and I'll give you a very detailed explanation of how to do it (so that you learn "how to fish" as opposed to being "given a fish", so to speak).


OK, I have a group of self-hosted WordPress blogs. All I want to do is grab the /feed link, as if I had clicked the orange icon.

Of course, I'm sure you know I can't just click it, because it's not shown in the software's browser.

What I want to do is go to the page, grab the link, and save it to a file.


Drag in a save to file command (under Variable commands).

Choose the file you'd like to save the links to.

Don't place anything under content in the parameter window.

Click OK and go to the page you'd like to scrape.

Click the empty red space under content in the save to file command.

Right-click on the link and choose the first div (or the second, or the third, whichever gives you the best results).

Go down and find the $page scrape command.

Select it, then select the link you want from the page.

2.jpg

Click OK, and run the script.

1.jpg

 

 

 

 

 


It worked to grab the link, but not the feed. I want to get the website.com/feed link.

But I have an idea; I just don't know how to execute it.

I can scrape a ton of these links from a site that actually lists them, and add "/feed" to the end of each link when I save them or add them to a list with a variable, I'm sure... but I don't know how.


Oh, so you're just trying to save the feed URL of the RSS page you're on. Just put the URL constant in a save to file command. That will save the URL of whatever page you're on.

Add each one to a list, and then save the list to a file when you're all done.

 


No, no, sorry, I stated it wrong. But thank you, I didn't know that's what the URL constant did.

What I want to do is scrape some URLs on a page. They're lined up basically like this:

 

link1

link2

link3

link4

etc

 

I want to scrape them and at the same time add "/feed" to the end, so that they save to the list like this:

 

link1/feed

link2/feed

link3/feed

link4/feed

etc

 

 

How would I do that?


Ok.

 

You could scrape the links first and add them to a list.

Then create a loop that runs as many times as the list total of the list containing the scraped URLs.

Inside the loop, create a set command whose value is the next list item constant followed by /feed.

Still inside the loop, create another add to list command that puts each URL with /feed appended into a new list.

Add a save to file command at the end, outside of the loop, to save the newly modified URLs to a file on your computer.

 

 

I'm attaching the script here: adding to urls.ubot

 

In the add to list at the top, insert the file that contains your urls.
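
For anyone who finds it easier to read the logic as text, here is a rough sketch of the same steps in Python. This is not UBot script and not the contents of the attached file; the function name, the example URLs, and the trailing-slash handling are my own assumptions for illustration.

```python
def add_feed_suffix(urls, suffix="/feed"):
    """Return a new list with the suffix appended to every URL."""
    feed_urls = []                      # the "new list" from the steps above
    for url in urls:                    # the loop over the scraped list
        # Assumption: drop a trailing slash so we never produce "//feed".
        feed_urls.append(url.rstrip("/") + suffix)
    return feed_urls


# Hypothetical scraped links; in practice these come from the page scrape.
scraped = ["http://example.com/blog1", "http://example.com/blog2/"]

feed_urls = add_feed_suffix(scraped)

# Joining with newlines gives the text that would be saved to the file.
print("\n".join(feed_urls))
```

The loop, the append step, and the final save each correspond to one of the commands listed above; only the data structure names differ.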


Thank you.

Also, I tried to use the remove from list command, but it didn't work.

I tried to remove a link from a list that I scraped and save the modified list to a file, but the link I specified (position 0, then I tried position 1) kept passing through the filter.


Remove from list is working fine on my end. I might need to take a look at your script. Maybe post a screenshot or something.


Hi lilly, the bot you made for me was awesome, and I'm really starting to get the hang of this whole thing now. This community is pretty cool. But for some reason the bot you made me is doing this:

link/feed

link/feed

link/feed

feed

What do you think might be making it leave that extra feed on the end like that?
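
One possible cause, which nobody in this thread confirmed, is a blank line at the end of the file of URLs: the loop then treats the empty entry as one more link and appends the suffix to nothing. The Python sketch below only illustrates that idea; it is not UBot script, and the function name and sample data are assumptions.

```python
def add_feed_skip_blanks(lines, suffix="/feed"):
    """Append the suffix to each non-empty line, ignoring blank entries."""
    feed_urls = []
    for line in lines:
        url = line.strip()              # remove surrounding whitespace
        if not url:                     # skip blank entries entirely
            continue
        feed_urls.append(url + suffix)
    return feed_urls


# Hypothetical file contents ending with a newline, as many editors save them.
file_contents = "link1\nlink2\nlink3\n"
lines = file_contents.split("\n")      # the last element is an empty string

# Without the blank check, the empty entry becomes a stray suffix-only line.
naive = [line + "/feed" for line in lines]
cleaned = add_feed_skip_blanks(lines)

print(naive)
print(cleaned)
```

If that is what is happening, deleting the empty last line from the input file, or skipping empty items inside the loop, should remove the stray entry.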
