UBot Underground

Biks

Fellow UBotter

Posts posted by Biks

  1. I'm trying to automatically download .acsm files (EPUBs) from the Overdrive library network. I just can't get the final click that actually STARTS the download.

     


    This script gets you logged in with a temporary account # and takes you to the account loan page where I want to download the book. When I run the click in NODE mode, it works, but not in the sequence. I've also tried moving the mouse over the button (works in NODE mode, not in the sequence). Scraping the URL and then downloading doesn't seem to work either. What am I doing wrong?

     

    Note: this uses a temporary library card # - good for 2 weeks starting June 17th. If other people are monkeying with this particular script, you may have to RETURN the book first to get here (or just download another ebook).

    navigate("https://sails.overdrive.com/account/ozone/sign-in","Wait")
    click(<innertext="SAILS Library Patrons">,"Left Click","No")
    wait(2)
    comment("Note: This # is good for only 2 weeks (June 17)")
    type text(<username field>,22043000141061,"Standard")
    wait(1)
    type text(<password field>,7777,"Standard")
    wait(1)
    click(<login button>,"Left Click","No")
    wait for browser event("DOM Ready","")
    wait for browser event("Everything Loaded","")
    wait(5)
    navigate("https://sails.overdrive.com/sails-wareham/content/media/429609","Wait")
    wait for browser event("Everything Loaded","")
    wait(1)
    click(<aria-label="Borrow An Essay on Satire, Particularly on the Dunciad">,"Left Click","No")
    wait(3)
    click(<class="button radius secondary contrast u-allCaps borrow-button">,"Left Click","No")
    wait for browser event("Everything Loaded","")
    wait(3)
    navigate("https://sails.overdrive.com/sails-wareham/content/account/loans","Wait")
    wait for browser event("Everything Loaded","")
    wait(5)
    comment("Can\'t click this button HERE")
    click(<innertext="Download
    Open EPUB ebook">,"Left Click","No")
    wait(3)
    plugin command("WindowsCommands.dll", "keyboard event", "Enter", "Key Press")
    wait for browser event("Everything Loaded","")
    wait(2)
    

    The HTML for the button looks like this:

                    <div class="main-buttons">
                        
                            
                                <a target="_blank" class="loan-button-nonkindle button radius primary downloadButton" tabindex="0" role="button"
        data-format-id="ebook-epub-open" data-media-id="429609"
        data-format-name="Open EPUB ebook">
            <b>Download</b><br/>
            <span class="dl-text">Open EPUB ebook</span>
    </a>
                            
                            <div class="Loans-divider-container">
                                <div class="Loans-divider">
                                    
                                    <span class="Loans-orText u-allCaps">
                                        or
                                    </span>
                                    
                                </div>
                            </div>
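Since the <a> above has no href - the actual .acsm link is assembled by Overdrive's own page script from the data-* attributes - scraping a URL straight off the button won't work. As a minimal sketch (not UBot code, and assuming the markup matches the snippet above), the identifiers could at least be pulled out of the scraped HTML with Python's standard library:

```python
import re

def extract_download_attrs(html):
    """Pull the data-* attributes off the loan download button.

    Assumes markup like the snippet above. The real .acsm URL is
    built by Overdrive's page JavaScript, so this only recovers
    the identifiers, not a ready-to-fetch link.
    """
    attrs = {}
    for name in ("data-format-id", "data-media-id", "data-format-name"):
        m = re.search(name + r'="([^"]*)"', html)
        if m:
            attrs[name] = m.group(1)
    return attrs
```

From there the identifiers could be fed into whatever request the page script actually makes; watching the network tab while clicking the button in a normal browser would reveal the real endpoint.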
    
  2. I'm also trying to download a file behind a paywall and having the same problem - I can't start the file download; running the node works, but running the entire script doesn't. How did you solve it?

     

    I'm trying to download an ebook via Overdrive: 

     

    https://bpl.overdrive.com/bpl-visitor/content/media/1438507

     

    Obviously you need a library card and to be logged in to initiate the download/borrow. I can download the .acsm URL directly - that works, but the file doesn't open in Adobe Digital Editions. When I click it through the online interface instead, it adds something about my subscription.

     

    I mean we're talking about having UBot MOVE THE MOUSE over the button and clicking it. Works in node mode, not when running.

     

    Just the weirdest thing. I can get all the other buttons to click (add to catalog, return book).

  3. I'm trying to automate a simple Google Alert. It looks like it should be easy, but the standard UBot choices don't work. You can't drag anything over, and the drop-down menu choices aren't the classic <option value=ITEM>.

     

    Google Alerts creation page: https://www.google.com/alerts (you obviously need to be logged in with a Google account)

     

    The HTML for one of the drop menus looks like this:

     

    <div class="goog-flat-menu-button jfk-select volume_select goog-inline-block" tabindex="0" role="listbox" aria-activedescendant=":6" aria-expanded="false" aria-haspopup="true" style="-webkit-user-select: none;">
      <div class="goog-inline-block goog-flat-menu-button-caption" id=":6" role="option" aria-setsize="2" aria-posinset="1">Only the best results</div>
      <div class="goog-inline-block goog-flat-menu-button-dropdown" aria-hidden="true"> </div>
    </div>  </td> </tr>

     

    Is this a JavaScript thing?

     

    I REALLY need to automate a bunch of Google alerts - lemme know if you need a few bucks to look at this.

     

    http://i.imgur.com/PPhQgM5.png
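One hedged sketch of a possible workaround: the widget is a Google Closure "flat menu button" rather than a native <select>, so it could be driven from page JavaScript (e.g. via UBot's run javascript) by dispatching real mouse events. The selector, the .goog-menuitem item class, and the whole approach below are my assumptions, not verified against the live page - Closure menus typically react to mousedown/mouseup rather than click. The Python helper just builds the JavaScript string:

```python
def build_dropdown_js(menu_selector, option_text):
    """Return a JavaScript snippet that opens a Closure-style flat menu
    button and picks the menu item whose text matches option_text.

    Hypothetical: selectors and the mousedown/mouseup approach are
    assumptions about how Closure widgets respond, not confirmed
    against Google Alerts.
    """
    return (
        "var btn = document.querySelector('%s');"
        "btn.dispatchEvent(new MouseEvent('mousedown', {bubbles: true}));"
        "btn.dispatchEvent(new MouseEvent('mouseup', {bubbles: true}));"
        "var items = document.querySelectorAll('.goog-menuitem');"
        "for (var i = 0; i < items.length; i++) {"
        "  if (items[i].textContent.trim() === '%s') {"
        "    items[i].dispatchEvent(new MouseEvent('mousedown', {bubbles: true}));"
        "    items[i].dispatchEvent(new MouseEvent('mouseup', {bubbles: true}));"
        "  }"
        "}"
    ) % (menu_selector, option_text)
```

The returned string would be passed to run javascript; if the menu still doesn't open, a short delay between the mousedown and mouseup dispatches may be needed.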

     

     

  4.  

    I'm not 100% sure about this, so don't quote me, but I think it installs certain libraries that a lot of programs use, and it's pretty likely you already had it installed unless you're working with a fresh OS.

     

    Does UBot require the Microsoft Visual C++ redistributable to run? Then I'm assuming any compiled bots would also require it...

     

    I don't have the Developer Edition of UBot - if I DID have it, would all these additional files also be required for compiled bots that go out to other people?

     

    Thanks everyone for helping me out on this.

  5. Simple question: do compiled bots need the .NET Framework to run?

     

    I'm giving someone (a non-programmer) one of my compiled bots on Windows 10. He can't get it to run. I'm assuming this is the issue.

     

    Any other tips I should know when giving out bots (in addition to adding an exclusion in Windows Security)?

  6. I've done this a million times, and now I can't. Just trying to scrape emails.  :P

     

    In an earlier version of UBot, this regex worked for scraping emails (NODE VIEW):

    (\([A-Z0-9._%-])+@([A-Z0-9.-]+)\.([A-Z]{2,4})(\
    

    But in CODE VIEW I see this:

    (\\([A-Z0-9._%-])+@([A-Z0-9.-]+)\\.([A-Z]\{2,4\})(\\
    

    It's adding more slashes. What's going on? When someone says USE THIS REGEX, do I paste it into NODE or CODE view?

     

    This regex code is supposed to scrape all variations of emails:

    [a-zA-Z0-9\._\-]{3,}(@|AT|\s(at|AT)\s|\s*[\[\(\{]\s*(at|AT)\s*[\]\}\)]\s*)[a-zA-Z]{3,}(\.|DOT|\s(dot|DOT)\s|\s*[\[\(\{]\s*(dot|DOT)\s*[\]\}\)]\s^*)[a-zA-Z]{2,}((\.|DOT|\s(dot|DOT)\s|\s*[\[\(\{]\s*(dot|DOT)\s*[\]\}\)]\s*)[a-zA-Z]{2,})?$
    

    It doesn't work for me when I paste it into NODE view. When I paste it into CODE view, it says I have errors at chars 65, 67, and 71.

     

    What does an email-capture regex look like before I paste it into NODE view, and what will it look like in CODE view?

     

    ** 60 MINUTES LATER **

     

    OK, this works: http://www.rubular.com/r/nidQpOizwC

    (\w+(\s|)@(\s|)[a-zA-Z_]+?\.[a-zA-Z]{2,3})
    

    But that won't catch any [at] or {AT} versions (as in the long one above - anyone have a working version of that?).
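For what it's worth, here is one way to build an obfuscation-tolerant pattern in plain regex syntax. It's a sketch of the idea (shown in Python so it can be tested outside UBot, not a drop-in UBot snippet): it catches user@host, "user [at] host [dot] com", and spaced at/dot variants, then normalizes matches back to plain addresses.

```python
import re

# Hypothetical pattern (mine, not from UBot): plain addresses plus
# "[at]"/"(at)" and spaced "at"/"dot" obfuscations, case-insensitive.
AT = r"(?:@|\s+at\s+|\s*[\[\(]\s*at\s*[\]\)]\s*)"
DOT = r"(?:\.|\s+dot\s+|\s*[\[\(]\s*dot\s*[\]\)]\s*)"
EMAIL = re.compile(
    r"[a-z0-9._%+-]+"                  # local part
    + AT
    + r"[a-z0-9-]+"                    # first domain label
    + r"(?:" + DOT + r"[a-z0-9-]+)*"   # optional extra labels
    + DOT
    + r"[a-z]{2,}",                    # TLD
    re.IGNORECASE,
)

def normalize(match_text):
    """Rewrite an obfuscated match back to plain user@host.tld form."""
    s = re.sub(r"\s+at\s+|\s*[\[\(]\s*at\s*[\]\)]\s*", "@",
               match_text, flags=re.IGNORECASE)
    s = re.sub(r"\s+dot\s+|\s*[\[\(]\s*dot\s*[\]\)]\s*", ".",
               s, flags=re.IGNORECASE)
    return s
```

Any pattern like this would still need UBot's own escaping once pasted into CODE view - which appears to be where the extra backslashes in the question come from: code view seems to escape \ and { } so node view can show them literally, so regex should normally be pasted into NODE view.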

  7. clear list(%followers)
    navigate("https://soundcloud.com/random-house-audio/followers","Wait")
    wait for browser event("Everything Loaded","")
    loop(9999) {
        add list to list(%followers,$scrape attribute(<class="userBadgeListItem__heading sc-type-small sc-link-dark sc-truncate">,"href"),"Delete","Global")
        run javascript("window.setTimeout(function() \{
    window.scrollTo(0, document.body.scrollHeight)
     \}, 500)")
        wait(3)
    }
    save to file("C:\\Users\\Public\\Ubot\\Soundcloud\\SCRAPED USERS.txt",%followers)
    

    That's basically what I'm doing. Each JavaScript page load gives me 25 new profiles at the end of the column. Once I hit 42,000+ (approx. loop #1680), it's game over - the system locks up.

     

    What I would like to do is save, then somehow delete everything from memory up to 42,000, then continue - but I can't; it's one single long page of results. From what I remember, Twitter does the same thing. What I've done is move the SAVE TO FILE command inside the loop, so I catch everything before the crash. But I'm still stuck at 42,000.

     

    Elsewhere, I HAVE scraped long sequences where the site loads page 1, 2, 3, etc. I save a batch, create a new browser to reset, then continue. Not here.
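The usual way around a hard memory ceiling is to stop accumulating everything in one list: append each batch to disk as it arrives and keep only a dedupe set in memory. A minimal Python sketch of the pattern (fetch_batch is a placeholder for one scroll-and-scrape round, not a real UBot call):

```python
def scrape_in_batches(fetch_batch, out_path, max_batches):
    """Append each batch of scraped hrefs to out_path and drop it from
    memory, keeping only a set of hrefs seen so far for deduping.

    fetch_batch(n) is a hypothetical stand-in for one scroll-and-scrape
    round; it should return the list of hrefs currently visible.
    """
    seen = set()
    with open(out_path, "a", encoding="utf-8") as f:
        for n in range(max_batches):
            batch = fetch_batch(n)
            new = [h for h in batch if h not in seen]
            if not new:          # page stopped giving fresh results
                break
            seen.update(new)
            f.write("\n".join(new) + "\n")
            f.flush()            # results survive even if we crash later
    return len(seen)
```

The other half of the problem is the page itself: with infinite scroll the DOM keeps growing inside the browser, so it can also help to delete already-scraped nodes from the page via run javascript (element.remove()) after each save.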

  8.  

    From what I can see, these deal with data once you've acquired it. The problem is that I need to hold 2.5 million entries in memory (before the scrape finishes) before I can do anything with them. I can't parse the INPUT into manageable smaller sections.

     

    Giganut, how many Twitter followers can you scrape at one time?

  9. I have never gotten UBot to scrape beyond a certain point. It seems that once I hit around 42,000 entries, the whole thing collapses. I just had this happen twice on the same site. I'm guessing I'm running out of memory. I currently have 16 GB - will doubling my memory help?

     

    I've recently been grabbing followers on a few websites that require you to keep loading a new batch of users as you scroll down the page (using the JavaScript load command). There's no way of stopping, saving, and continuing beyond a certain point; it's just an endless list.

     

    As an example: the Spotify Twitter account has 2.5 million followers. How the hell would I scrape 2.5 million entries with UBot? Are there any other places/services that could do this?

  10. I was banging my head against LinkedIn today. I could scrape and make lists of people, but when I went directly to their profile pages, I couldn't scrape ANYTHING (after changing user agents, turning things off, etc.).

     

    Apparently LinkedIn takes exception to this: https://techcrunch.com/2016/08/15/linkedin-sues-scrapers/

     

    Again, it's the profile pages they're protecting. Has anyone had any success with this?

  11. I'm not a hardcore programmer when it comes to UBot, so I could use some help understanding what's going on with Amazon.

     

    I'm trying to scrape emails (plus social media links) from Amazon profiles. I can see them in a regular browser, but they don't even show up within UBot.

     

    Example Profile: https://www.amazon.com/gp/profile/amzn1.account.AGLZ3EQ5DDQQW33D64E33RTXYMHA

     

    http://i.imgur.com/0f7V5rD.png

     

    This is a capture from Firefox. SEND AN EMAIL doesn't even appear within UBot, and the icons don't link either. Am I dealing with an iframe on the page? What is going on here? How does one scrape this, or did Amazon kick our butts? :-)

     

  12. OK, I finally managed to get a 2captcha account with credits. My script looks correct, and I can see somebody selecting the images, but they only seem to do one pass at it. Google refreshes the images and wants you to keep selecting multiple times.

    navigate("https://www.youtube.com/channel/UC5WAKQB3fOMPmM6DhDjF5Eg/about","Wait")
    wait for browser event("Everything Loaded","")
    wait(2)
    click(<class="yt-uix-button yt-uix-button-size-small yt-uix-button-default business-email-button">,"Left Click","No")
    wait(3)
    click(<class="yt-uix-button yt-uix-button-size-small yt-uix-button-default business-email-button">,"Left Click","No")
    wait(3)
    click(<class="recaptcha-checkbox-spinner">,"Left Click","No")
    wait(5)
    solve click captcha(<id="rc-imageselect">)
    

    Am I missing something? Who is supposed to hit the verify button? Am I doing something wrong, or do the 2captcha workers not realize they're supposed to keep going?
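For image reCAPTCHAs that refresh and demand several passes, many 2captcha users sidestep clicking entirely and use the token-based API instead: submit the site key and page URL to in.php with method=userrecaptcha, poll res.php until the g-recaptcha-response token is ready, then inject the token into the page. A sketch of just the request construction, using Python's standard library (no network call is made here, and the key values are placeholders):

```python
from urllib.parse import urlencode

API_BASE = "https://2captcha.com"  # standard 2captcha endpoint

def build_submit_url(api_key, sitekey, page_url):
    """in.php request asking 2captcha to solve the reCAPTCHA as a
    token (method=userrecaptcha) instead of clicking image grids."""
    query = urlencode({
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": page_url,
    })
    return "%s/in.php?%s" % (API_BASE, query)

def build_poll_url(api_key, captcha_id):
    """res.php request that fetches the solved g-recaptcha-response
    token for a previously submitted captcha id."""
    query = urlencode({"key": api_key, "action": "get", "id": captcha_id})
    return "%s/res.php?%s" % (API_BASE, query)
```

in.php returns an ID; res.php is then polled every few seconds until it returns the token instead of CAPCHA_NOT_READY. This way nobody has to keep clicking through Google's refreshing grids at all.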

  13. I just tried uploading a photo to Twitter - nada. I was going to post this same type of thread.

     

     

    If I had the only bot that could upload pics to Twitter, I wouldn't share that info either. In my journey thread I might.

     

    So... has ANYONE made a UBot script that can upload an image to Twitter? (Note: the mobile version doesn't give you an option to upload an image.) Can I buy someone else's bot that does this? Is this method some deep dark secret that nobody is willing to disclose? Has anyone taken a crack at it? What's the difficulty?

     

    Right now I have a bot that posts a text tweet and follows 4,000 users via 15 accounts. It takes 7 hours to complete, never crashes, and works beautifully. I was hoping I could get some images going and improve my retweet rate.

  14. OK, I'm not a programmer. When I first saw headless browsing advertised, I had no idea what it was.

     

    At this point I understand what it's doing - going out on the web without rendering the entire page. I also understand that it's faster. But what exactly can and can't I do with it?

     

    One of the things I only recently discovered is using Scrapebox to actually SCRAPE things - I'd always used it as a spam bot. I'm now able to plow through hundreds of websites, scraping emails in seconds with Scrapebox and a few proxies. I'm guessing that's what Scrapebox uses - a headless browser that just grabs the basic HTML to search.

     

    Can I use headless browsing to log into an account and do something? Let's say I want to follow a bunch of people on Twitter. Does headless browsing disrupt the login process? What can't I do with headless browsing?
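As a rule of thumb: headless or plain-HTTP scraping reaches exactly what is in the raw HTML the server sends, and nothing a page script injects afterwards. A small self-test sketched in Python - collect the hrefs present in raw HTML; if this comes back empty while a real browser shows the links, the content is script-rendered and a plain fetch won't see it:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects every href found in raw HTML - the kind of data a
    headless/plain-HTTP scraper can reach without running any page
    JavaScript (script bodies are treated as opaque text)."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def links_in(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Logging in is usually still possible without rendering, since a login form boils down to an HTTP POST; what breaks are JavaScript-driven flows (infinite scroll, dynamically built buttons) like the ones in the posts above.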
