UBot Underground

Hi ladies and gents,

 

I've searched all over for this. Can't seem to figure it out.

 

I have scraped data to a list. However, here's what one list item looks like:

 

(0):
            <a href="/watch?v=2ah30iH8AUA" class="related-video yt-uix-contextlink related-video-featured yt-uix-sessionlink" data-sessionlink="ei=feleUc3QDsmw8gOZzoHAAw&feature=fvwrel&ved=CBQQzRooAA">    <span class="ux-thumb-wrap contains-addto "><span class="video-thumb ux-thumb yt-thumb-default-120 "><span class="yt-thumb-clip"><span class="yt-thumb-clip-inner"><img alt="" data-thumb="//i3.ytimg.com/vi/2ah30iH8AUA/default.jpg" src="//i3.ytimg.com/vi/2ah30iH8AUA/default.jpg" width="120" data-group-key="thumb-group-0"><span class="vertical-align"></span></span></span></span><span class="video-time">2:02</span>


  <button onclick=";return false;" type="button" title="Watch Later" class="addto-button video-actions spf-nolink addto-watch-later-button-sign-in yt-uix-button yt-uix-button-default yt-uix-button-short yt-uix-tooltip" data-video-ids="2ah30iH8AUA" data-button-menu-id="shared-addto-watch-later-login" role="button"><span class="yt-uix-button-content">  <img src="//s.ytimg.com/yts/img/pixel-vfl3z5WfW.gif" alt="Watch Later">
 </span><img class="yt-uix-button-arrow" src="//s.ytimg.com/yts/img/pixel-vfl3z5WfW.gif" alt="" title=""></button>
</span>
<span dir="ltr" class="title" title="How to customize Foobar2000">How to customize Foobar2000</span><span class="stat attribution">by <span class="yt-user-name " dir="ltr">razethew0rld</span></span><span class="stat alt badge"><span class="yt-badge-std">Featured</span></span>        <span class="stat view-count">
        29,572
    </span>

</a>
        

From all of that I want to create a table with 3 columns: href (in this case "/watch?v=2ah30iH8AUA"), title (in this case "How to customize Foobar2000"), and view-count (in this case "29,572").

 

How can I do that? I've been scratching my head for 2 days trying to crack it. No luck.

 

Thanks

 

Tim

 

 


I'm not a regex master at all, but that would likely be the route. 

Using $find regular expression with v=[\w-]+ will return v=2ah30iH8AUA (the hyphen is in the character class because YouTube IDs can contain one, and \w alone would stop there). From there you'd put /watch? in front of it and put it in a cell. That's about the extent of my regex knowledge. If this is data you lifted from a page, it's a lot easier just to scrape it while you're on the page, with something like

set(#href, $page scrape("<a href=\"/watch?v=", "\" class="), "Global")

set(#title, $page scrape("class=\"title\" title=\"", "\">"), "Global")

etc
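
For anyone who wants to try the regex route outside UBot first, here is a minimal Python sketch of the same idea applied to one list item. It is only an illustration, not UBot code: the three patterns and the helper name first_group are my own assumptions, and the sample item is a trimmed copy of the HTML from the first post.

```python
import re

# One scraped list item, trimmed to the parts that matter here.
item = '''<a href="/watch?v=2ah30iH8AUA" class="related-video">
<span dir="ltr" class="title" title="How to customize Foobar2000">How to customize Foobar2000</span>
<span class="stat view-count">
        29,572
    </span>
</a>'''

# Assumed patterns, one capture group per column.
HREF_RE = re.compile(r'href="(/watch\?v=[\w-]+)"')
TITLE_RE = re.compile(r'class="title"\s+title="([^"]*)"')
VIEWS_RE = re.compile(r'class="stat view-count">\s*([\d,]+)')


def first_group(pattern, text):
    """Return the first captured group, or an empty string if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else ""


href = first_group(HREF_RE, item)
title = first_group(TITLE_RE, item)
views = first_group(VIEWS_RE, item)
print(href, title, views, sep=" | ")
```

The same three patterns should be usable in any regex-capable tool, as long as it lets you read back a capture group rather than the whole match.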

Thanks for your reply. It doesn't work for me because I have more than one video to scrape.

I tried scraping into 3 separate lists (href, title, view count), but the results in each list end up at different positions, so the rows don't line up and the table is inaccurate.

 

I'll tell you if I figure it out. Otherwise, please share any wisdom :)
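
One way around the misaligned lists, sketched in Python rather than UBot, is to keep the single list of scraped items and pull all three fields out of each item before moving to the next, so every item becomes exactly one table row and a missing field becomes an empty cell instead of shifting the column. The patterns, the item_to_row helper, and the second sample item (which deliberately has no view count) are made up for illustration.

```python
import csv
import io
import re

# Assumed patterns, one per table column, each with a single capture group.
FIELDS = {
    "href": re.compile(r'href="(/watch\?v=[\w-]+)"'),
    "title": re.compile(r'class="title"\s+title="([^"]*)"'),
    "views": re.compile(r'class="stat view-count">\s*([\d,]+)'),
}


def item_to_row(item_html):
    """Build one table row from one scraped item; missing fields become ''."""
    row = []
    for pattern in FIELDS.values():
        match = pattern.search(item_html)
        row.append(match.group(1) if match else "")
    return row


items = [
    # Trimmed copy of the item from the first post.
    '<a href="/watch?v=2ah30iH8AUA"><span class="title" '
    'title="How to customize Foobar2000"></span>'
    '<span class="stat view-count">\n 29,572\n</span></a>',
    # Made-up second item with no view count, to show the row stays aligned.
    '<a href="/watch?v=xxxxxxxxxxx"><span class="title" '
    'title="Made-up second video"></span></a>',
]

rows = [item_to_row(item) for item in items]

# Write the rows as CSV text; a file opened with newline="" works the same way.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(list(FIELDS))
writer.writerows(rows)
print(out.getvalue())
```

The point of the design is that the row is assembled per item, so nothing depends on three independent scrapes returning the same number of results in the same order.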
