UBot Underground

Scraping Dropdown Content



Hello guys,

 

I can't manage to do something that I think must be really simple.

 

On some websites, I found dropdowns with HTML code like this:

 

<center><select name="id_cat" class="list_cats">
<option value=0>Choose a Category</option>
<option value=0>--------------------------</option>
<option label="Category 1 " value=11>Category 1</option>
<option label="Category 2 " value=5>Category 2</option>
</select>
</center>

 

I "just" want to scrape :

"Category 1 " value=11>Category 1

"Category 2 " value=5>Category 2

 

I tried this :

 

CHOOSE BY ATTRIBUTE

-----InnerHtml

-----<OPTION label=*</OPTION>

-----Wildcards

 

As far as I know, this is supposed to select all of the categories.

 

ADD TO LIST

-----%scraped-categories

-----$scrape chosen attribute

----------innerHTML

 

This is supposed to put all the scraped categories in a list.

 

And it just doesn't work at all... :blink:

 

Could you please help me? I'm kind of desperate, and I'm sure it MUST be simple.

 

Cheers,


First, I have done what you are trying to do, but it took a lot of coding on my side to make it happen, AND my solution will not work on your site.

 

Here is where you are going to get tripped up.

 

the "

 

Now here is a potential solution

 

 

Use Choose by Attribute on the innerhtml, let the following be your Search String, and also select Wildcards

 

Then do an Add to List, and for your content use $scrape chosen attribute with innerhtml

 

That will get you each of these into your list

 

 

Now set up a loop to go through each item in the list, and for each item perform a $replace

 

search for "

 

perform another $replace

 

search for "

" and replace that with a Null/nothing

 

You will need enough of these replaces to cover "Category 1", "Category 2", "Category 3", and so on until "Category x" is met.

 

Remember, I am doing this from memory. Not being able to see the actual dropdown on your website limits me to some degree.

 

But this is the essence of how I achieved my scrape.
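
For anyone who wants to see the clean-up loop outside of UBot, here is a minimal Python sketch of the same idea. It assumes each list item looks like one full option element from the first post (the `scraped` list and the `clean` helper are made up for illustration), and it uses two pattern-based replaces instead of one literal $replace per category, so it is a variation on the approach above rather than the exact same steps.

```python
import re

# Hypothetical list contents: one scraped item per option element,
# shaped like the HTML in the first post.
scraped = [
    '<OPTION label="Category 1 " value=11>Category 1</OPTION>',
    '<OPTION label="Category 2 " value=5>Category 2</OPTION>',
]

def clean(item):
    """Strip the opening and closing option tags, keeping the visible text."""
    # First replace: remove the opening tag, up to and including its ">".
    item = re.sub(r'^<option\b[^>]*>', '', item, flags=re.IGNORECASE)
    # Second replace: remove the closing tag.
    item = re.sub(r'</option>$', '', item, flags=re.IGNORECASE)
    return item.strip()

# Loop through each item in the list and clean it.
cleaned = [clean(item) for item in scraped]
print(cleaned)
```

Because the opening tag is matched by pattern, the same two replaces cover every category, however many there are.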


Hi, thanks for your help!

 

I tried, but it still won't collect the categories (I'm just talking about unformatted categories; formatting is another story...)

 

You can have a look at the bot attached, in case it inspires you :rolleyes:

category-scrapping.ubot

 

I also tried a pagescrape, but it doesn't work either... (I thought it would work with the following: pagescrape LEFT: <option label= RIGHT: </option>, but it doesn't scrape anything...)
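
To show what that LEFT/RIGHT scrape is expected to return, here is a rough Python sketch run against the HTML from the first post. This is not UBot code: `scrape_between` is a made-up helper, and the case-insensitive matching is an assumption added so that `<option` and `<OPTION` are treated the same.

```python
import re

# The dropdown HTML from the first post.
HTML = '''<center><select name="id_cat" class="list_cats">
<option value=0>Choose a Category</option>
<option value=0>--------------------------</option>
<option label="Category 1 " value=11>Category 1</option>
<option label="Category 2 " value=5>Category 2</option>
</select>
</center>'''

def scrape_between(text, left, right):
    """Return every shortest stretch of text found between left and right."""
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, text, flags=re.IGNORECASE | re.DOTALL)

categories = scrape_between(HTML, "<option label=", "</option>")
for category in categories:
    print(category)
```

On this sample the two delimiters do isolate the two category lines, so if the real pagescrape returns nothing, the page source the bot sees may differ from this snippet (for example in tag case or attribute order).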

 

Cheers,


Hmmmm

 

The only way that I can see to do it is to write some JavaScript to extract the text between ">" and ""

 

It can be done, just not by me. It would take me a few months. LOL
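
As a rough idea of what such an extraction could look like, here is a small sketch in Python rather than JavaScript, using only the standard library's `html.parser`. The `OptionCollector` class is made up for illustration, and skipping options with `value=0` is an assumption based on the sample HTML in the first post, where the placeholder rows use that value.

```python
from html.parser import HTMLParser

# The dropdown HTML from the first post.
HTML = '''<center><select name="id_cat" class="list_cats">
<option value=0>Choose a Category</option>
<option value=0>--------------------------</option>
<option label="Category 1 " value=11>Category 1</option>
<option label="Category 2 " value=5>Category 2</option>
</select>
</center>'''

class OptionCollector(HTMLParser):
    """Collect the value attribute and visible text of every option element."""

    def __init__(self):
        super().__init__()
        self.options = []
        self._current = None  # the option being read, or None outside one

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._current = {"value": dict(attrs).get("value"), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "option" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.options.append(self._current)
            self._current = None

collector = OptionCollector()
collector.feed(HTML)

# Assumption: value 0 marks the placeholder and separator rows.
categories = [o for o in collector.options if o["value"] != "0"]
print(categories)
```

Parsing the tags this way avoids any search-and-replace clean-up afterwards, since the text and the value come out as separate fields.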

 

I was able to scrape the contents of the dropdown and your code can too. Just change your Search String in your "choose by attribute" node to *>*</OPTION>

 

When you examine the output file you will see all of the options. Now, had the website coder written the

 

For instance, search for "

" and replace it with "".

 

Sorry that my answer does not get you closer to a solution.


I was able to scrape the contents of the dropdown and your code can too. Just change your Search String in your "choose by attribute" node to *>*</OPTION>

 

It's weird, I thought the way I had written my Choose by Attribute was correct. I'll try yours.

 

Thanks !
