UBot Underground

How to scrape data from table under a condition on a webpage?



I want to scrape the DOMAIN only if the status is AVAILABLE in the table below.
 

<table>

<tr>

</tr>

.

.

.

 

<tr class="bg2">
<td class="field_domain"><a href="/goto/16/497vua/3/" target="_blank" rel="nofollow" title="lotfor.net">domainname.net</a></td>
<td class="field_length">6</td>
<td class="field_pr"><a href="/goto/11/47nxdz/3/" target="_blank" rel="nofollow" class="sprite spr spr2" title="PageRank 2"><span>2</span></a></td>
<td class="field_domainpop"><a href="/goto/28/499n09/3/" target="_blank" rel="nofollow" title="0">0</a></td>
<td class="field_creationdate"><a href="/goto/8/46i5h0/3/" target="_blank" rel="nofollow" title="Whois Creation Date: 2009-10-02">2009</a></td>
<td class="field_abirth"><a href="/goto/6/46hiog/3/" target="_blank" rel="nofollow" title="First seen 2010-12-29, 2 saved results">2010</a></td>
<td class="field_alexa"><a href="/goto/4/46hfhh/3/" target="_blank" rel="nofollow" title="0">0</a></td>
<td class="field_quantcast"><a href="/goto/22/47th89/3/" target="_blank" rel="nofollow" title="0">0</a></td>
<td class="field_dmoz">-</td>
<td class="field_statuscom"><a href="/goto/18/48p0pk/3/?tld=com" target="_blank" rel="nofollow" class="sprite stlds stld22" title="available"><span>available</span></a></td>
<td class="field_statusnet"><a href="/goto/18/48p0pk/3/?tld=net" target="_blank" rel="nofollow" class="sprite stlds stld22" title="available"><span>available</span></a></td>
<td class="field_statusorg"><a href="/goto/18/48p0pk/3/?tld=org" target="_blank" rel="nofollow" class="sprite stlds stld22" title="available"><span>available</span></a></td>
<td class="field_statusbiz"><a href="/goto/18/48p0pk/3/?tld=biz" target="_blank" rel="nofollow" class="sprite stlds stld22" title="available"><span>available</span></a></td>
<td class="field_statusinfo"><a href="/goto/18/48p0pk/3/?tld=info" target="_blank" rel="nofollow" class="sprite stlds stld21" title="registered"><span>registered</span></a></td>
<td class="field_statusde"><a href="/goto/18/48p0pk/3/?tld=de" target="_blank" rel="nofollow" class="sprite stlds stld22" title="available"><span>available</span></a></td>
<td class="field_searchesglobal">40</td>
<td class="field_competition">2</td>
<td class="field_acpc">0.00 USD</td>
<td class="field_changes">Yesterday 19:44</td>
<td class="field_whois"><a href="/goto/1/44s6w5/3/" target="_blank" rel="nofollow" title="Register now" class="status_free">available</a></td>
<td class="field_relatedlinks"><a class="sprite sicon smenu domainlinks" href="#" id="expirednet-44s6w5" title="Related Links (click here for menu)"><span>RL</span></a></td>
</tr>

.

.

.

<tr>

</tr>

</table>


 
I can easily scrape all the domains and store them in a list, but I have no idea how to do it conditionally: scrape only the domains whose status is available.
 
Any help will be much appreciated.


class="status_free"

 

I have this code, but it only scrapes the word 'available' into the list.

add list to list(%domain_list, $scrape attribute(<class="status_free">, "innertext"), "Don\'t Delete", "Global")

Obviously you can't scrape the domain using the class="status_free" tag, because the domain does not appear in that tag. But you can use it as an identifier to check whether the domain is available, and if it is, scrape the domain from class="field_domain" in the same row.
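
To illustrate the idea outside UBot, here is a minimal Python sketch of "use status_free as the flag, take the text from field_domain". It is not UBot script and not necessarily how you would wire it up in a bot. The class name `AvailableDomainParser`, the function `scrape_available`, the sample rows, and the `status_taken` class on the registered row are all made up for the example; only `field_domain`, `field_whois` and `status_free` come from the table posted above.

```python
from html.parser import HTMLParser

# Hypothetical sample: one row marked status_free, one not.
# "status_taken" is an assumed class name, not taken from the real page.
SAMPLE_HTML = """
<table>
<tr>
</tr>
<tr class="bg2">
<td class="field_domain"><a href="#" title="x">domainname.net</a></td>
<td class="field_whois"><a href="#" title="Register now" class="status_free">available</a></td>
</tr>
<tr class="bg1">
<td class="field_domain"><a href="#" title="y">takenname.net</a></td>
<td class="field_whois"><a href="#" class="status_taken">registered</a></td>
</tr>
</table>
"""


class AvailableDomainParser(HTMLParser):
    """Collects the field_domain text of every row that contains status_free."""

    def __init__(self):
        super().__init__()
        self.available = []
        self._in_domain_cell = False
        self._domain = ""
        self._row_is_free = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tr":
            # New row: forget whatever the previous row set.
            self._domain = ""
            self._row_is_free = False
        elif tag == "td" and "field_domain" in classes:
            self._in_domain_cell = True
        elif tag == "a" and "status_free" in classes:
            self._row_is_free = True

    def handle_data(self, data):
        if self._in_domain_cell:
            self._domain += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_domain_cell = False
        elif tag == "tr" and self._row_is_free and self._domain:
            # Row finished: keep the domain only if the flag was seen.
            self.available.append(self._domain)


def scrape_available(html):
    parser = AvailableDomainParser()
    parser.feed(html)
    return parser.available


print(scrape_available(SAMPLE_HTML))
```

The point is that the decision is made per row: the domain is remembered when its cell is seen, and only kept once the row turns out to contain the status_free link.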


What I'm doing now is creating three lists: one for the domains, one for the statuses, and a third for the result. I use a loop to filter out all the non-available domains and put the remaining ones into the third list.

 

But I believe there must be a better way to do it, using an if/else condition while scraping the domains. I just have no clue how to do it.

 

For example:

 

loop {

if class=status_free 

then scrape the domain

}

 

How can I tell how many loop iterations will be needed? I don't know, unless I store all the domains in a list in advance to get the list total.
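
On the loop count: if the loop runs over the table rows themselves, the count never has to be known in advance, because the loop simply ends when the rows run out. Below is a rough Python sketch of the pseudocode above, not UBot script. It assumes the page source is already in a string; `available_domains`, the two regexes and the sample rows (including the `status_taken` class) are invented for illustration, and regex matching on HTML like this is fragile if the markup changes.

```python
import re

# Each <tr>...</tr> block is treated as one row.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.S)
# Link text inside the field_domain cell.
DOMAIN_RE = re.compile(r'<td class="field_domain">\s*<a\b[^>]*>([^<]+)</a>', re.S)

# Hypothetical sample rows; "status_taken" is an assumed class name.
SAMPLE_ROWS = """
<table>
<tr>
</tr>
<tr class="bg2">
<td class="field_domain"><a href="#">alpha.net</a></td>
<td class="field_whois"><a href="#" title="Register now" class="status_free">available</a></td>
</tr>
<tr class="bg1">
<td class="field_domain"><a href="#">beta.net</a></td>
<td class="field_whois"><a href="#" class="status_taken">registered</a></td>
</tr>
<tr class="bg2">
<td class="field_domain"><a href="#">gamma.net</a></td>
<td class="field_whois"><a href="#" title="Register now" class="status_free">available</a></td>
</tr>
</table>
"""


def available_domains(html):
    result = []
    # for-each over the rows: no counter and no list total needed
    for row in ROW_RE.findall(html):
        # if class=status_free ...
        if 'class="status_free"' not in row:
            continue
        # ... then scrape the domain from the same row
        match = DOMAIN_RE.search(row)
        if match:
            result.append(match.group(1).strip())
    return result


print(available_domains(SAMPLE_ROWS))
```

This also replaces the three-list approach with a single result list, since the status check and the domain scrape happen on the same row in one pass.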

