A small explanation needed with scrape attribute

Kev · June 4, 2012

Hi ubotters,

From my understanding of ubot, the ability to select an element in order to grab the url of the item in question has always worked for me.

Recently, though, I'm finding that from the element selection I am only able to grab things like the URL of the image, rather than a link to the page that the image is on. For example, go to this page: http://vimeo.com/channels/aework/videos

Put a mouse over any of those videos - you can see the path to the page they are on, down at the bottom of your browser. Now do the same in ubot and do a scrape attribute. It will only scrape the image url.

So, when I do a scrape attribute there's no way for me to grab that url path... So, I've taken to looking inside the source code of that page, but honestly, I don't know what I'm supposed to be looking for.

So, my question really is this: is there a simple way for me to obtain these paths to the correct pages rather than scraping the image url?

I think I will need to look outside the element selector for this, but it is this bit I have the problem with. What should I be looking for in the source code? Is it something to do with <class>? If so, what?

Sometimes I see <class id = xyz

However, when I get a solution from ubotters, I often see that the result is something like <class= and I can't see that anywhere inside the source code, so I figure that you guys must know something that will make this easier for me.

Looking forward to your help! btw this page http://vimeo.com/channels/aework/videos is a perfect example. The same goes for a pinterest page http://pinterest.com/search/?q=apples (thanks to k1 (Kevin) for helping me with that).

Cheers,

Kev

Kreatus (Ubot Ninja) · June 4, 2012

Here's the code that will work on vimeo. Just paste it into ubot.

add list to list(%channels, $scrape attribute(<href=w"/channels/*/*">, "fullhref"), "Delete", "Global")

Another one using regex. To get only the main channel urls:

add list to list(%channels, $scrape attribute(<href=r"channels/\\w+\\/[0-9]\{4,10\}">, "fullhref"), "Delete", "Global")

You need to expand your selection on scrape attribute to get the details you want.
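
To get a rough feel for what the two selectors above pick up, here is a minimal Python sketch (plain Python, not ubot code). It rests on two assumptions: that the w"..." wildcard behaves like a shell-style glob, and that the doubled backslashes in the regex are ubot escaping, so the underlying pattern is channels/\w+/[0-9]{4,10}. The sample hrefs and the function names are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical href values, invented for illustration (not taken from the live page).
SAMPLE_HREFS = [
    "/channels/aework/12345678",
    "/channels/aework/videos",
    "/channels/aework",
    "/user1234/likes",
]

# Assumption: ubot's w"/channels/*/*" wildcard acts like a shell-style glob.
WILDCARD = "/channels/*/*"

# Assumption: the regex from the post with ubot's extra escaping removed.
CHANNEL_VIDEO = re.compile(r"channels/\w+/[0-9]{4,10}")


def wildcard_matches(hrefs):
    """Return the hrefs that match the glob-style wildcard."""
    return [h for h in hrefs if fnmatchcase(h, WILDCARD)]


def regex_matches(hrefs):
    """Return the hrefs that contain a channel name followed by a 4-10 digit id."""
    return [h for h in hrefs if CHANNEL_VIDEO.search(h)]


if __name__ == "__main__":
    print(wildcard_matches(SAMPLE_HREFS))
    print(regex_matches(SAMPLE_HREFS))
```

Under those assumptions the wildcard keeps every href with two path segments after /channels/, while the regex keeps only the ones that end in a numeric id.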

Kev · June 4, 2012

Hi Kreatus,

This is the bit I don't understand. What do you mean by expanding my selection?

Kreatus (Ubot Ninja) · June 4, 2012

Hi Kev,

What I mean by that is you need to expand your scrape attribute selection to get more attribute options. Check this screenshot
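
As for what to look for in the source code: a thumbnail is usually an <img> tag wrapped in an <a> tag, and the page url sits in the href of the <a>, not on the image. Expanding the selection presumably moves from the image out to that enclosing link. Here is a small Python sketch of the idea (plain Python, not ubot code). The markup, the class name and the function name are made up for illustration, and the real page's HTML may look different.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for illustration; the real page's HTML may differ.
SAMPLE_HTML = """
<ol>
  <li><a href="/channels/aework/11111111"><img src="http://example.com/thumb1.jpg"></a></li>
  <li><a href="/channels/aework/22222222"><img src="http://example.com/thumb2.jpg"></a></li>
</ol>
"""


class ThumbnailLinkParser(HTMLParser):
    """Pairs each <img> src with the href of the <a> that encloses it."""

    def __init__(self):
        super().__init__()
        self._open_hrefs = []  # hrefs of the <a> tags that are currently open
        self.pairs = []        # (image src, enclosing link href)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a":
            self._open_hrefs.append(attributes.get("href"))
        elif tag == "img" and self._open_hrefs:
            # The image only knows its own src; the page url is on the parent link.
            self.pairs.append((attributes.get("src"), self._open_hrefs[-1]))

    def handle_endtag(self, tag):
        if tag == "a" and self._open_hrefs:
            self._open_hrefs.pop()


def thumbnail_links(html):
    """Return (image src, link href) pairs for images that sit inside a link."""
    parser = ThumbnailLinkParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    for src, href in thumbnail_links(SAMPLE_HTML):
        print(src, "->", href)
```

So when you read the source, find the <img> you clicked on and look one level up for the <a href="..."> around it. That href is the value the scrape attribute needs to target.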
