UBot Underground

Biks

Fellow UBotter

Posts posted by Biks

  1. I'm trying to automatically download .acsm files (EPUBs) from the Overdrive library network. I just can't get the final click that actually STARTS the download.

     


    This script gets you logged in with a temporary account # and takes you to the account loan page where I want to download the book. When I run the click in NODE mode, it works, but not in the sequence. I've also tried moving the mouse over the button (works in NODE mode, not in the sequence). Scraping the URL and then downloading doesn't seem to work either. What am I doing wrong?

     

    Note: this uses a temporary library card # - good for 2 weeks starting June 17th. If other people are monkeying with this particular script, you may have to RETURN the book first to get here (or just download another ebook).

    navigate("https://sails.overdrive.com/account/ozone/sign-in","Wait")
    click(<innertext="SAILS Library Patrons">,"Left Click","No")
    wait(2)
    comment("Note: This # is good for only 2 weeks (June 17)")
    type text(<username field>,22043000141061,"Standard")
    wait(1)
    type text(<password field>,7777,"Standard")
    wait(1)
    click(<login button>,"Left Click","No")
    wait for browser event("DOM Ready","")
    wait for browser event("Everything Loaded","")
    wait(5)
    navigate("https://sails.overdrive.com/sails-wareham/content/media/429609","Wait")
    wait for browser event("Everything Loaded","")
    wait(1)
    click(<aria-label="Borrow An Essay on Satire, Particularly on the Dunciad">,"Left Click","No")
    wait(3)
    click(<class="button radius secondary contrast u-allCaps borrow-button">,"Left Click","No")
    wait for browser event("Everything Loaded","")
    wait(3)
    navigate("https://sails.overdrive.com/sails-wareham/content/account/loans","Wait")
    wait for browser event("Everything Loaded","")
    wait(5)
    comment("Can\'t click this button HERE")
    click(<innertext="Download
    Open EPUB ebook">,"Left Click","No")
    wait(3)
    plugin command("WindowsCommands.dll", "keyboard event", "Enter", "Key Press")
    wait for browser event("Everything Loaded","")
    wait(2)
    

    The HTML for the button looks like this:

                    <div class="main-buttons">
                        
                            
                                <a target="_blank" class="loan-button-nonkindle button radius primary downloadButton" tabindex="0" role="button"
        data-format-id="ebook-epub-open" data-media-id="429609"
        data-format-name="Open EPUB ebook">
            <b>Download</b><br/>
            <span class="dl-text">Open EPUB ebook</span>
    </a>
                            
                            <div class="Loans-divider-container">
                                <div class="Loans-divider">
                                    
                                    <span class="Loans-orText u-allCaps">
                                        or
                                    </span>
                                    
                                </div>
                            </div>
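Since the <a> above has no href - the actual .acsm link is assembled by Overdrive's own page script from the data-* attributes - scraping a URL straight off the button won't work. As a minimal sketch (not UBot code, and assuming the markup matches the snippet above), the identifiers could at least be pulled out of the scraped HTML with Python's standard library:

```python
import re

def extract_download_attrs(html):
    """Pull the data-* attributes off the loan download button.

    Assumes markup like the snippet above. The real .acsm URL is
    built by Overdrive's page JavaScript, so this only recovers
    the identifiers, not a ready-to-fetch link.
    """
    attrs = {}
    for name in ("data-format-id", "data-media-id", "data-format-name"):
        m = re.search(name + r'="([^"]*)"', html)
        if m:
            attrs[name] = m.group(1)
    return attrs
```

From there the identifiers could be fed into whatever request the page script actually makes; watching the network tab while clicking the button in a normal browser would reveal the real endpoint.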
    
  2. I'm also trying to download a file behind a paywall and having the same problem - I can't start the file download; running the node works, but running the entire script doesn't. How did you solve it?

     

    I'm trying to download an ebook via Overdrive: 

     

    https://bpl.overdrive.com/bpl-visitor/content/media/1438507

     

    Obviously you need a library card and to be logged in to initiate the download/borrow. I can download the .acsm URL directly - that works, but the file doesn't open in Adobe Digital Editions. When I click it through the online interface instead, it adds something about my subscription.

     

    I mean we're talking about having UBot MOVE THE MOUSE over the button and clicking it. Works in node mode, not when running.

     

    Just the weirdest thing. I can get all the other buttons to click (add to catalog, return book).

  3. I'm trying to automate a simple Google Alert. It looks like it should be easy, but the standard UBot choices don't work. You can't drag anything over, and the drop-down menu choices aren't the classic <option value=ITEM>.

     

    Google Alerts creation page: https://www.google.com/alerts (you obviously need to be logged in with a Google account)

     

    The HTML for one of the drop menus looks like this:

     

    <div class="goog-flat-menu-button jfk-select volume_select goog-inline-block" tabindex="0" role="listbox" aria-activedescendant=":6" aria-expanded="false" aria-haspopup="true" style="-webkit-user-select: none;">
      <div class="goog-inline-block goog-flat-menu-button-caption" id=":6" role="option" aria-setsize="2" aria-posinset="1">Only the best results</div>
      <div class="goog-inline-block goog-flat-menu-button-dropdown" aria-hidden="true"> </div>
    </div>  </td> </tr>

     

    Is this a JavaScript thing?

     

    I REALLY need to automate a bunch of Google alerts - lemme know if you need a few bucks to look at this.

     

    http://i.imgur.com/PPhQgM5.png
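One hedged sketch of a possible workaround: the widget is a Google Closure "flat menu button" rather than a native <select>, so it could be driven from page JavaScript (e.g. via UBot's run javascript) by dispatching real mouse events. The selector, the .goog-menuitem item class, and the whole approach below are my assumptions, not verified against the live page - Closure menus typically react to mousedown/mouseup rather than click. The Python helper just builds the JavaScript string:

```python
def build_dropdown_js(menu_selector, option_text):
    """Return a JavaScript snippet that opens a Closure-style flat menu
    button and picks the menu item whose text matches option_text.

    Hypothetical: selectors and the mousedown/mouseup approach are
    assumptions about how Closure widgets respond, not confirmed
    against Google Alerts.
    """
    return (
        "var btn = document.querySelector('%s');"
        "btn.dispatchEvent(new MouseEvent('mousedown', {bubbles: true}));"
        "btn.dispatchEvent(new MouseEvent('mouseup', {bubbles: true}));"
        "var items = document.querySelectorAll('.goog-menuitem');"
        "for (var i = 0; i < items.length; i++) {"
        "  if (items[i].textContent.trim() === '%s') {"
        "    items[i].dispatchEvent(new MouseEvent('mousedown', {bubbles: true}));"
        "    items[i].dispatchEvent(new MouseEvent('mouseup', {bubbles: true}));"
        "  }"
        "}"
    ) % (menu_selector, option_text)
```

The returned string would be passed to run javascript; if the menu still doesn't open, a short delay between the mousedown and mouseup dispatches may be needed.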

     

     

  4.  

    I'm not 100% sure about this, so don't quote me, but I think it installs certain libraries that a lot of programs use, and it's pretty likely you already had it installed unless you're working with a fresh OS.

     

    Does UBot require the Microsoft Visual C++ redistributable to run? Then I'm assuming any compiled bots would also require it...

     

    I don't have the Developer Edition of UBot - if I DID have it, would all these additional files also be required for compiled bots that go out to other people?

     

    Thanks everyone for helping me out on this.

  5. Simple question: do compiled bots need the .NET Framework to run?

     

    I'm giving someone (a non-programmer) one of my compiled bots on Windows 10. He can't get it to run. I'm assuming this is the issue.

     

    Any other tips I should know when giving out bots (in addition to adding an exclusion in Windows Security)?

  6. I've done this a million times, and now I can't. Just trying to scrape emails.  :P

     

    In an earlier version of UBot, this regex worked for scraping emails (NODE VIEW):

    (\([A-Z0-9._%-])+@([A-Z0-9.-]+)\.([A-Z]{2,4})(\
    

    But in CODE VIEW I see this:

    (\\([A-Z0-9._%-])+@([A-Z0-9.-]+)\\.([A-Z]\{2,4\})(\\
    

    It's adding more slashes. What's going on? When someone says USE THIS REGEX, do I paste it into NODE or CODE view?

     

    This regex code is supposed to scrape all variations of emails:

    [a-zA-Z0-9\._\-]{3,}(@|AT|\s(at|AT)\s|\s*[\[\(\{]\s*(at|AT)\s*[\]\}\)]\s*)[a-zA-Z]{3,}(\.|DOT|\s(dot|DOT)\s|\s*[\[\(\{]\s*(dot|DOT)\s*[\]\}\)]\s^*)[a-zA-Z]{2,}((\.|DOT|\s(dot|DOT)\s|\s*[\[\(\{]\s*(dot|DOT)\s*[\]\}\)]\s*)[a-zA-Z]{2,})?$
    

    It doesn't work for me when I paste it into NODE view. When I paste it into CODE view, it says I have errors at chars 65, 67, and 71.

     

    What does an email-capture regex look like before I paste it into NODE view, and what will it look like in CODE view?

     

    ** 60 MINUTES LATER **

     

    OK, this works: http://www.rubular.com/r/nidQpOizwC

    (\w+(\s|)@(\s|)[a-zA-Z_]+?\.[a-zA-Z]{2,3})
    

    But that won't catch any [at] or {AT} versions (as in the long one above - anyone have a working version of that?).
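For what it's worth, here is one way to build an obfuscation-tolerant pattern in plain regex syntax. It's a sketch of the idea (shown in Python so it can be tested outside UBot, not a drop-in UBot snippet): it catches user@host, "user [at] host [dot] com", and spaced at/dot variants, then normalizes matches back to plain addresses.

```python
import re

# Hypothetical pattern (mine, not from UBot): plain addresses plus
# "[at]"/"(at)" and spaced "at"/"dot" obfuscations, case-insensitive.
AT = r"(?:@|\s+at\s+|\s*[\[\(]\s*at\s*[\]\)]\s*)"
DOT = r"(?:\.|\s+dot\s+|\s*[\[\(]\s*dot\s*[\]\)]\s*)"
EMAIL = re.compile(
    r"[a-z0-9._%+-]+"                  # local part
    + AT
    + r"[a-z0-9-]+"                    # first domain label
    + r"(?:" + DOT + r"[a-z0-9-]+)*"   # optional extra labels
    + DOT
    + r"[a-z]{2,}",                    # TLD
    re.IGNORECASE,
)

def normalize(match_text):
    """Rewrite an obfuscated match back to plain user@host.tld form."""
    s = re.sub(r"\s+at\s+|\s*[\[\(]\s*at\s*[\]\)]\s*", "@",
               match_text, flags=re.IGNORECASE)
    s = re.sub(r"\s+dot\s+|\s*[\[\(]\s*dot\s*[\]\)]\s*", ".",
               s, flags=re.IGNORECASE)
    return s
```

Any pattern like this would still need UBot's own escaping once pasted into CODE view - which appears to be where the extra backslashes in the question come from: code view seems to escape \ and { } so node view can show them literally, so regex should normally be pasted into NODE view.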

  7. clear list(%followers)
    navigate("https://soundcloud.com/random-house-audio/followers","Wait")
    wait for browser event("Everything Loaded","")
    loop(9999) {
        add list to list(%followers,$scrape attribute(<class="userBadgeListItem__heading sc-type-small sc-link-dark sc-truncate">,"href"),"Delete","Global")
        run javascript("window.setTimeout(function() \{
    window.scrollTo(0, document.body.scrollHeight)
     \}, 500)")
        wait(3)
    }
    save to file("C:\\Users\\Public\\Ubot\\Soundcloud\\SCRAPED USERS.txt",%followers)
    

    That's basically what I'm doing. Each JavaScript page load gives me 25 new profiles at the end of the column. Once I hit 42,000+ (approx. loop #1680), it's game over - the system locks up.

     

    What I would like to do is save, then somehow delete everything from memory up to 42,000, then continue - but I can't; it's one single long page of results. From what I remember, Twitter does the same thing. What I've done is move the SAVE TO FILE command inside the loop, so I catch everything before the crash. But I'm still stuck at 42,000.

     

    Elsewhere, I HAVE scraped long sequences where the site loads page 1, 2, 3, etc. I save a batch, create a new browser to reset, then continue. Not here.
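The usual way around a hard memory ceiling is to stop accumulating everything in one list: append each batch to disk as it arrives and keep only a dedupe set in memory. A minimal Python sketch of the pattern (fetch_batch is a placeholder for one scroll-and-scrape round, not a real UBot call):

```python
def scrape_in_batches(fetch_batch, out_path, max_batches):
    """Append each batch of scraped hrefs to out_path and drop it from
    memory, keeping only a set of hrefs seen so far for deduping.

    fetch_batch(n) is a hypothetical stand-in for one scroll-and-scrape
    round; it should return the list of hrefs currently visible.
    """
    seen = set()
    with open(out_path, "a", encoding="utf-8") as f:
        for n in range(max_batches):
            batch = fetch_batch(n)
            new = [h for h in batch if h not in seen]
            if not new:          # page stopped giving fresh results
                break
            seen.update(new)
            f.write("\n".join(new) + "\n")
            f.flush()            # results survive even if we crash later
    return len(seen)
```

The other half of the problem is the page itself: with infinite scroll the DOM keeps growing inside the browser, so it can also help to delete already-scraped nodes from the page via run javascript (element.remove()) after each save.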

  8.  

    From what I can see, these deal with data once you've acquired it. The problem is that I need to hold 2.5 million entries in memory (before the scrape finishes) before I can do anything with them. I can't parse the INPUT into manageable smaller sections.

     

    Giganut, how many Twitter followers can you scrape at one time?

  9. I have never gotten UBot to scrape beyond a certain point. It seems that once I hit around 42,000 entries, the whole thing collapses. I just had this happen twice on the same site. I'm guessing I'm running out of memory. I currently have 16 GB - will doubling my memory help?

     

    I've recently been grabbing followers on a few websites that require you to keep loading a new batch of users as you scroll down the page (using the JavaScript load command). There's no way of stopping, saving, and continuing beyond a certain point; it's just an endless list.

     

    As an example: the Spotify Twitter account has 2.5 million followers. How the hell would I scrape 2.5 million entries with UBot? Are there any other places/services that could do this?

  10. I was banging my head against LinkedIn today. I could scrape and make lists of people, but when I went directly to their profile pages, I couldn't scrape ANYTHING (after changing user agents, turning things off, etc.).

     

    Apparently LinkedIn takes exception to this: https://techcrunch.com/2016/08/15/linkedin-sues-scrapers/

     

    Again, it's the profile pages they're protecting. Has anyone had any success with this?

  11. I'm not a hardcore programmer when it comes to UBot, so I could use some help understanding what's going on with Amazon.

     

    I'm trying to scrape emails (plus social media links) from Amazon profiles. I can see them in a regular browser, but they don't even show up within UBot.

     

    Example Profile: https://www.amazon.com/gp/profile/amzn1.account.AGLZ3EQ5DDQQW33D64E33RTXYMHA

     

    http://i.imgur.com/0f7V5rD.png

     

    This is a capture from Firefox. SEND AN EMAIL doesn't even appear within UBot, and the icons don't link either. Am I dealing with an iframe on the page? What is going on here? How does one scrape this, or did Amazon kick our butts? :-)

     

  12. OK, I finally managed to get a 2captcha account with credits. My script looks correct, and I can see somebody selecting the images, but they only seem to do one pass at it. Google refreshes the images and wants you to keep selecting multiple times.

    navigate("https://www.youtube.com/channel/UC5WAKQB3fOMPmM6DhDjF5Eg/about","Wait")
    wait for browser event("Everything Loaded","")
    wait(2)
    click(<class="yt-uix-button yt-uix-button-size-small yt-uix-button-default business-email-button">,"Left Click","No")
    wait(3)
    click(<class="yt-uix-button yt-uix-button-size-small yt-uix-button-default business-email-button">,"Left Click","No")
    wait(3)
    click(<class="recaptcha-checkbox-spinner">,"Left Click","No")
    wait(5)
    solve click captcha(<id="rc-imageselect">)
    

    Am I missing something? Who is supposed to hit the verify button? Am I doing something wrong, or do the 2captcha workers not realize they're supposed to keep going?
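For image reCAPTCHAs that refresh and demand several passes, many 2captcha users sidestep clicking entirely and use the token-based API instead: submit the site key and page URL to in.php with method=userrecaptcha, poll res.php until the g-recaptcha-response token is ready, then inject the token into the page. A sketch of just the request construction, using Python's standard library (no network call is made here, and the key values are placeholders):

```python
from urllib.parse import urlencode

API_BASE = "https://2captcha.com"  # standard 2captcha endpoint

def build_submit_url(api_key, sitekey, page_url):
    """in.php request asking 2captcha to solve the reCAPTCHA as a
    token (method=userrecaptcha) instead of clicking image grids."""
    query = urlencode({
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": page_url,
    })
    return "%s/in.php?%s" % (API_BASE, query)

def build_poll_url(api_key, captcha_id):
    """res.php request that fetches the solved g-recaptcha-response
    token for a previously submitted captcha id."""
    query = urlencode({"key": api_key, "action": "get", "id": captcha_id})
    return "%s/res.php?%s" % (API_BASE, query)
```

in.php returns an ID; res.php is then polled every few seconds until it returns the token instead of CAPCHA_NOT_READY. This way nobody has to keep clicking through Google's refreshing grids at all.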

  13. I just tried uploading a photo to Twitter - nada. I was going to post this same type of thread.

     

     

    If I had the only bot that could upload pics to Twitter, I wouldn't share that info either. In my journey thread I might.

     

    So... has ANYONE made a UBot script that can upload an image to Twitter? (Note: the mobile version doesn't give you an option to upload an image.) Can I buy someone else's bot that does this? Is this method some deep dark secret that nobody is willing to disclose? Has anyone taken a crack at it? What's the difficulty?

     

    Right now I have a bot that posts a text tweet and follows 4,000 users via 15 accounts. It takes 7 hours to complete, never crashes, and works beautifully. I was hoping I could get some images going and improve my retweet rate.

  14. OK, I'm not a programmer. When I first saw headless browsing advertised, I had no idea what it was.

     

    At this point I understand what it's doing - going out on the web without rendering the entire page. I also understand that it's faster. But what exactly can and can't I do with it?

     

    One of the things I only recently discovered is using Scrapebox to actually SCRAPE things - I'd always used it as a spam bot. I'm now able to plow through hundreds of websites, scraping emails in seconds with Scrapebox and a few proxies. I'm guessing that's what Scrapebox uses - a headless browser that just grabs the basic HTML to search.

     

    Can I use headless browsing to log into an account and do something? Let's say I want to follow a bunch of people on Twitter. Does headless browsing disrupt the login process? What can't I do with headless browsing?
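As a rule of thumb: headless or plain-HTTP scraping reaches exactly what is in the raw HTML the server sends, and nothing a page script injects afterwards. A small self-test sketched in Python - collect the hrefs present in raw HTML; if this comes back empty while a real browser shows the links, the content is script-rendered and a plain fetch won't see it:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects every href found in raw HTML - the kind of data a
    headless/plain-HTTP scraper can reach without running any page
    JavaScript (script bodies are treated as opaque text)."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def links_in(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Logging in is usually still possible without rendering, since a login form boils down to an HTTP POST; what breaks are JavaScript-driven flows (infinite scroll, dynamically built buttons) like the ones in the posts above.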
