Chris M 55 Posted June 29, 2014 Hey guys, I scraped a page and need to return only the first result of a dynamic result that changes: <div class="div_flow_charts"> <div id="trust_flow_chart" class="flow_chart_container" style="width: 170px; height: 170px;"></div> <span class="trustCategories "> <div class="category_wrap"> <ul class="category_list"> <li class="cat_line society"> <span class="the_score"><b>17</b></span> <span class="the_percentage" style="display:none;">23.21</span> <span class="the_category">Society / Philosophy</span> </li> <li class="cat_line business"> <span class="the_score"><b>17</b></span> <span class="the_percentage" style="display:none;">20.83</span> <span class="the_category">Business</span> </li> <li class="cat_line computers"> <span class="the_score"><b>16</b></span> <span class="the_percentage" style="display:none;">12.49</span> <span class="the_category">Computers / Internet / Web Design and Development</span> </li> <li class="cat_line regional"> <span class="the_score"><b>15</b></span> <span class="the_percentage" style="display:none;">8.81</span> <span class="the_category">Regional / Asia</span> </li> </ul><ul class="category_list"> <li class="cat_line games"> <span class="the_score"><b>15</b></span> <span class="the_percentage" style="display:none;">7.97</span> <span class="the_category">Games / Video Games / Puzzle</span> </li> <li class="cat_line arts"> <span class="the_score"><b>13</b></span> <span class="the_percentage" style="display:none;">4.06</span> <span class="the_category">Arts / Design</span> </li> <li class="cat_line society"> <span class="the_score"><b>13</b></span> <span class="the_percentage" style="display:none;">3.79</span> <span class="the_category">Society / Politics</span> </li> <li class="cat_line arts"> <span class="the_score"><b>13</b></span> <span class="the_percentage" style="display:none;">3.29</span> <span class="the_category">Arts / Music</span> </li> </ul><ul
class="category_list"> <li class="cat_line computers"> <span class="the_score"><b>13</b></span> <span class="the_percentage" style="display:none;">2.52</span> <span class="the_category">Computers / Internet / News and Media</span> </li> <li class="cat_line business"> <span class="the_score"><b>12</b></span> <span class="the_percentage" style="display:none;">2.19</span> <span class="the_category">Business / Business Services</span> </li> <li class="cat_line society"> <span class="the_score"><b>12</b></span> <span class="the_percentage" style="display:none;">1.71</span> <span class="the_category">Society / People</span> </li> <li class="cat_line recreation"> <span class="the_score"><b>12</b></span> <span class="the_percentage" style="display:none;">1.55</span> <span class="the_category">Recreation / Aviation</span> </li> </ul><ul class="category_list"> <li class="cat_line business"> <span class="the_score"><b>11</b></span> <span class="the_percentage" style="display:none;">0.86</span> <span class="the_category">Business / Materials</span> </li> <li class="cat_line business"> <span class="the_score"><b>11</b></span> <span class="the_percentage" style="display:none;">0.7</span> <span class="the_category">Business / Arts and Entertainment</span> </li> <li class="cat_line recreation"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.61</span> <span class="the_category">Recreation / Travel</span> </li> <li class="cat_line home"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.42</span> <span class="the_category">Home / Cooking</span> </li> </ul><ul class="category_list"> <li class="cat_line arts"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.42</span> <span class="the_category">Arts / Entertainment</span> </li> <li class="cat_line computers"> <span class="the_score"><b>10</b></span> <span class="the_percentage" 
style="display:none;">0.4</span> <span class="the_category">Computers / Internet / Abuse</span> </li> <li class="cat_line society"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.37</span> <span class="the_category">Society</span> </li> <li class="cat_line society"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.36</span> <span class="the_category">Society / Religion and Spirituality</span> </li> </ul><ul class="category_list"> <li class="cat_line computers"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.35</span> <span class="the_category">Computers / Internet / On the Web</span> </li> <li class="cat_line sports"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.33</span> <span class="the_category">Sports / Hockey</span> </li> <li class="cat_line recreation"> <span class="the_score"><b>10</b></span> <span class="the_percentage" style="display:none;">0.28</span> <span class="the_category">Recreation / Food</span> </li> <li class="cat_line arts"> <span class="the_score"><b>9</b></span> <span class="the_percentage" style="display:none;">0.22</span> <span class="the_category">Arts / Visual Arts</span> </li> </ul><ul class="category_list"> <li class="cat_line computers"> <span class="the_score"><b>9</b></span> <span class="the_percentage" style="display:none;">0.17</span> <span class="the_category">Computers</span> </li> <li class="cat_line computers"> <span class="the_score"><b>9</b></span> <span class="the_percentage" style="display:none;">0.17</span> <span class="the_category">Computers / Internet / Searching</span> </li> <li class="cat_line business"> <span class="the_score"><b>9</b></span> <span class="the_percentage" style="display:none;">0.16</span> <span class="the_category">Business / Marketing and Advertising</span> </li> <li class="cat_line games"> <span 
class="the_score"><b>9</b></span> <span class="the_percentage" style="display:none;">0.12</span> <span class="the_category">Games / Gambling</span> </li> </ul><ul class="category_list"> <li class="cat_line arts"> <span class="the_score"><b>8</b></span> <span class="the_percentage" style="display:none;">0.08</span> <span class="the_category">Arts / Radio</span> </li> <li class="cat_line sports"> <span class="the_score"><b>8</b></span> <span class="the_percentage" style="display:none;">0.07</span> <span class="the_category">Sports / Events</span> </li> </ul> </div> I'm scraping Majestic Topical Trust Flow, and the problem is that the result categories change depending on the site you're researching. In this instance I'm after the first result, which happens to be this: <li class="cat_line society"> <span class="the_score"><b>17</b></span> <span class="the_percentage" style="display:none;">23.21</span> <span class="the_category">Society / Philosophy</span> </li> The data I need to scrape is the score, the percentage and the category at the bottom. As you can see from the first code box above, there are several different classes, so it's hard for me to figure out how to return only the first result. And how can I separate the 3 values I am after? Any ideas?
Chris M 55 Posted June 29, 2014 Author I figured it out using regex and $find regex first.
Kreatus (Ubot Ninja) 422 Posted June 29, 2014 "I figured it out using regex and $find regex first" What plugin has that function?
HelloInsomnia 1103 Posted June 29, 2014 "What plugin has that function?" It's under Aymen Data Functions, so I think it is either HTTP Post or File Management.
sktan7 12 Posted July 5, 2014 I think that is done with add list to list and $find regex. After that, take the first item in the list.
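For anyone reading this outside UBot, here is a rough Python sketch of the same idea: match each category row with a regex, then keep only the first match. It is not the UBot script used above. The sample markup is shortened from the first post, and the names `ROW` and `first_category` are made up for this illustration. The pattern deliberately ignores the second class on the `<li>` (society, business, ...), since that is the part that changes from site to site.

```python
import re

# Shortened sample of the scraped markup from the first post.
html = (
    '<ul class="category_list"> '
    '<li class="cat_line society"> <span class="the_score"><b>17</b></span> '
    '<span class="the_percentage" style="display:none;">23.21</span> '
    '<span class="the_category">Society / Philosophy</span> </li> '
    '<li class="cat_line business"> <span class="the_score"><b>17</b></span> '
    '<span class="the_percentage" style="display:none;">20.83</span> '
    '<span class="the_category">Business</span> </li> </ul>'
)

# One row = one <li class="cat_line ...">; three capture groups,
# one each for score, percentage and category.
ROW = re.compile(
    r'<li class="cat_line[^"]*">\s*'
    r'<span class="the_score"><b>(\d+)</b></span>\s*'
    r'<span class="the_percentage"[^>]*>([\d.]+)</span>\s*'
    r'<span class="the_category">([^<]*)</span>'
)


def first_category(page):
    """Return (score, percentage, category) for the first row, or None."""
    match = ROW.search(page)  # search() stops at the first match
    if match is None:
        return None
    score, percentage, category = match.groups()
    return int(score), float(percentage), category.strip()


print(first_category(html))
```

`ROW.findall(html)` would return every row as a list of tuples, which is closer to the "add list to list, then take the first item" approach; `search` just skips building the full list.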