UBot Underground

Scraping Just The First Matching Item From A Page



I'm trying to scrape the first result from a list of results on a page.

 

At the moment I'm scraping the matching attribute into a list and then grabbing the first item from that list.

 

Seems a little redundant. I'm wondering if there is an easy way to just grab the first matching result and ignore the rest?

 

When I scrape the attribute I get a list like this.

 

item1
item2
item3
item4
...
item500
item501

 

I'm putting that into a list with $new line as the delimiter and then using $list item to get the first item.

 

What would be a better way to do this?


Thanks pash and Code Docta.

 

Yes, I've found I have to use something like this:

set(#itemName,$list item($list from text($scrape attribute(<innertext=w"*">,"innertext"),$new line),0),"Global")
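For anyone reading along outside UBot, here is a rough Python sketch of what that line does: take the newline-delimited scrape output, split it into a list, and read the item at index 0. This is not UBot code; the function name `first_item` and the sample data are made up for illustration.

```python
def first_item(scraped_text: str, index: int = 0) -> str:
    """Split newline-delimited scrape output and return the entry at `index`.

    Returns an empty string when the index is out of range.
    """
    # Drop blank lines so stray whitespace does not shift the index.
    items = [line.strip() for line in scraped_text.splitlines() if line.strip()]
    if not 0 <= index < len(items):
        return ""
    return items[index]


scraped = "item1\nitem2\nitem3"
print(first_item(scraped))
```

Note that the whole result set is still built before the first entry is read, which is the redundancy the original question is about.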

It would be handy if there were a simple scrape option that only returned the first matching item.


Pash already pointed you to the function that does that: $element offset returns the matching element at the offset you specify. If you want the first match, the offset is 0. So your code would most likely be:

set(#itemName,$scrape attribute($element offset(<innertext=w"*">,0),"innertext"),"Global")
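As a language-neutral illustration of the offset idea, here is a small Python sketch. It is not how UBot implements $element offset; `element_at_offset` and the sample rows are hypothetical. It only shows the general pattern of picking the Nth match directly instead of collecting every match first.

```python
from itertools import islice
from typing import Callable, Iterable, Optional


def element_at_offset(
    elements: Iterable[str],
    matches: Callable[[str], bool],
    offset: int = 0,
) -> Optional[str]:
    """Return the match at the zero-based `offset`, or None if there are too few."""
    # The generator is lazy, so iteration stops as soon as the wanted match is found.
    matching = (element for element in elements if matches(element))
    return next(islice(matching, offset, None), None)


rows = ["header", "item1", "item2", "item3"]
print(element_at_offset(rows, lambda text: text.startswith("item"), 0))
```

Offset 0 gives the first match, offset 1 the second, and so on; asking for an offset past the last match returns None rather than raising an error.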

Yes, that does work. However, it is unfortunate that the default action takes the last matching result rather than the first.

 

It would also be better if element offset didn't have to be manually edited into the code. If there is more than one matching result, surely an option to specify the offset should automatically appear.


What do you mean by "the default action doesn't take the first matching result rather than the last matching result"?

I see this a lot: people want specific answers to solve their particular problems, but it's much better for everyone if functions are more general like this, because that makes them far more flexible. If UBot had created more specific implementations, then perhaps we wouldn't be able to solve this problem so easily. There won't always be one single function to do the job. That is why functions are stackable like this, so that we can combine these more general building blocks to create our own specific ones.


In most use cases, the first matching result is probably the one that is needed. The last matching result of a table, for example, is unlikely to be the one needed, but it is what gets returned by default.


Can you show me an example of what you mean? I almost feel like we're talking about two different things.
