UBot Underground

Sebastian Rooks


Posts posted by Sebastian Rooks

  1. Hi Nick, I'm using the latest version of Ubot. I've gone from Chrome 21 to 39 and back to 21 again, and tried a myriad of user agent settings as well. I still couldn't change a social media profile or header picture to save my life.

    I don't know if it's me, the browser, or Ubot.

     

    So I came here looking for some clarity. I've got to say I find it a bit odd that the most recent discussion of changing a profile picture on Twitter is a few years old. Really? Does that mean that it's not a problem for anyone other than me? Or that no one wants to automate social media account creation? Or that I'm the first guy in the whole world to think of using Ubot to make social media accounts, and the world is mine because I'm the first to the table?

     

    I heard about Exbrowser and asked a couple of questions on the sales thread (that didn't get answered), I think a day or two before I posted this. Then, literally 5 minutes after I started this topic, my hard drive crashed and I ended up buying an unplanned new laptop. Those Exbrowser guys probably should've taken my money while I still had some.

  2. I'm having trouble uploading pictures on Twitter, Facebook, and Backpages (and probably everywhere else, but that's what I've been trying lately).

     

    I've tried all different user agents. The basic pattern of my experience is as follows:

     

    I step through a website in the Ubot browser while setting up the automation steps. I get to a point, normally involving uploading a picture...

    Sometimes I can get the command to click the button. Once. Maybe. But not always. Any further interaction with the website turns my mouse cursor into a pointing finger, and then it stays that way for what is usually forever. Usually after 30 to 60 seconds the cursor turns back to the usual mouse pointer over the code side of the Ubot window, but remains a pointing finger over the browser.

    From this point on, nothing really works right. Sometimes I can navigate away, other times I can't.

     

    After the first interaction, everything goes basically unresponsive. And so it goes, again and again, with everything I work on.

    So much so that I have so far failed to fully automate a single website. This is getting old and I want an understanding and a solution.

     

    I gather that working with Facebook and Twitter menus can be difficult, that's what took me to Backpages today, because I can't find a simpler looking site.

    I can't get the "choose images" button to work.

     

    What good is the rest of the work I've done on this site (or the others) if it won't upload an image? 

     

    I've tried different proxies, and no proxy. Eventually I close Ubot down and start over. The cycle never really ends.

     

    Again, why does trying to click on one of those buttons freeze the browser and, for the most part, Ubot itself?

    Most importantly, what can I do to correct it?

     

    Sincerely, 

     

    Your frustrated, pissed off and unproductive friend,

     

    Sebastian Rooks

     

    Now I can't even get an element in the browser to respond to a click from my actual mouse, and I just restarted Ubot. W in the actual F?

     

    I'm trying to select this button, effectively, consistently, and without freezing the browser and Ubot. Is that a lot to ask for?

    <div id="imageEditModule">
          <span class="editAdTitles"><b>Add Images</b></span><br>
          <span class="editAdText">Maximum 12 images, max size 10mb each.</span>
          <div id="editAdPhotoLayout">
            
          </div><!-- #imageEditLayout -->
          <div id="imageUploadProgress" style="display:none;">
            <span class="uploadProgressHeader">Image <span></span></span>: <span class="uploadProgressCurrent">0%</span>
          </div>
          
            <div id="addImageLinkCont" class="addImageItem">
          
            
              <input type="button" id="addImageLink" value="Choose Images">
            
            <input id="addImageInput" type="file" name="image" size="40" multiple="" accept="image/*" style="display:none;">
          </div>     
        </div><!-- .imageEditModule -->
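
    One thing the markup above hints at: the visible "Choose Images" button is probably only a front for the hidden addImageInput file input (note its style="display:none;"). That is an assumption, since the page's JavaScript isn't shown, but it is a common pattern, and clicking such a button usually opens a native file dialog rather than anything inside the page. A minimal Python sketch (not UBot code) that lists the file inputs in a chunk of HTML and flags the hidden ones, so you know which element is the real upload target:

```python
from html.parser import HTMLParser


class FileInputFinder(HTMLParser):
    """Collect every <input type="file"> and note whether it is hidden inline."""

    def __init__(self):
        super().__init__()
        self.file_inputs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "file":
            # Normalise the inline style so "display: none" and "display:none" both match.
            style = (attributes.get("style") or "").replace(" ", "").lower()
            self.file_inputs.append({
                "id": attributes.get("id"),
                "name": attributes.get("name"),
                "hidden": "display:none" in style,
            })


def find_file_inputs(html):
    finder = FileInputFinder()
    finder.feed(html)
    return finder.file_inputs


# Trimmed copy of the snippet posted above.
SNIPPET = '''
<div id="addImageLinkCont" class="addImageItem">
  <input type="button" id="addImageLink" value="Choose Images">
  <input id="addImageInput" type="file" name="image" size="40" multiple="" accept="image/*" style="display:none;">
</div>
'''

for file_input in find_file_inputs(SNIPPET):
    print(file_input)
```

    If the hidden input really is the upload target, pointing the automation at that element instead of the button may be worth a try.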
    
  3. Hey Dan,

     

    I'm having trouble with not being able to do things like change pictures and profile information in Twitter and Facebook from within the Ubot Browser. It seems that I can't get it to interact consistently. It's becoming a real problem.

     

    Will this plugin overcome this? Reliably?

     

    How hard is it to learn? I don't know anything about Xpath, and little about Regex.

     

    I also want to be able to store session cookies for those sites, which I understand to be HttpOnly, so that I can make multi-threaded account managers.

     

    Is this what I need?

     

    Thanks! 

  4. I'm on a define kick myself. Or rather they're kicking my.....

    I've got a bot using twenty something includes. I thought I was going to be streamlined and efficient, but I don't think I've made any honest forward progress in the last 12 straight hours of effort.

     

    A lot of my defines have other defines included with them. I've been crashing all day, and I think I'm almost to the bottom of it?

     

    Can you not use a define to call another define? That seems to be what's doing me in right now. Emphasis on "seems" because I usually end up being wrong.

     

    Anyway, thanks for the tutorial

  5. Thank you guys for your help!

     

    My problem ended up being that I made erroneous assumptions about what my problem was. I couldn't get my defines to work, and I assumed it was a problem with parameters, because that's the part I thought I didn't understand.

     

    The problem in my case (in case any newcomer makes a mistake like this) is that I worked backwards: I created unruly monster bots and then tried to clean them up by exporting defines to a new .ubot file. Having done that, I have to change every occurrence of a custom command to reflect its new path.

    (Completely obvious to most of you, I'm sure.) But I didn't realize it because the nodes look the same even though the code changes.

     

    So, for example:

      random wait time(2, 5)

     

     becomes

     

         run command("define - random wait time -  find table cell - add data to table","random wait time",2, 5)

     

    Looks like I've got a long day ahead of me...

  6. Thank you Pash. That's a great answer, and it should be enough.

    My problem is that I've already read those, watched that excellent video, and worked through TJ's example. I can make that work, but I can't quite apply it to my own scripts.

     

    Also, most of my defines involve lists and tables. Surely that's part of my problem.

    Here's an example of the shortest, cleanest define that I haven't yet made work via include (but it works fine when it's in the bot together). I feel like if someone can help me figure out how to get this one running, I can apply those lessons to the bigger ones that I have.

    define count hashtag duplications(#URLS Count) {
        set(#URLS Count,#URLS Count,"Global")
        clear list(%hashtags Original)
        add list to list(%hashtags Original,%hashtags,"Don\'t Delete","Global")
        clear list(%hashtags Unique)
        add list to list(%hashtags Unique,%hashtags Original,"Delete","Global")
        clear table(&Hashtags Count)
        set(#URLS Count,0,"Global")
        loop($list total(%hashtags Unique)) {
            set table cell(&hashtags Count,#URLS Count,0,$COUNT DUPLICATES($list item(%hashtags Unique,#URLS Count)))
            set table cell(&hashtags Count,#URLS Count,1,$list item(%hashtags Unique,#URLS Count))
            increment(#URLS Count)
        }
    }
    comment("used by hashtag searcher")
    define $COUNT DUPLICATES(#DEF URL) {
        set(#URL Count,0,"Global")
        set(#URLS Original,0,"Global")
        loop($list total(%hashtags Original)) {
            if($comparison($list item(%hashtags Original,#URLS Original),"=",#DEF URL)) {
                then {
                    increment(#URL Count)
                }
                else {
                }
            }
            increment(#URLS Original)
        }
        return(#URL Count)
    }
    
    

    I found this somewhere on this forum, and it works perfectly for my needs. (I wish I could remember who to credit.) It's also the shortest and cleanest example of what I don't understand how to make work via include.
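
    For anyone who wants to see what the two defines compute without reading UBot script, here is a rough Python equivalent (a sketch, not UBot code; the function name and the sample hashtags are made up for illustration). It builds one row per distinct hashtag, with the count in column 0 and the hashtag in column 1, the same layout the define writes into the table:

```python
from collections import Counter


def count_hashtag_duplications(hashtags):
    """Return [count, hashtag] rows, one per distinct hashtag, in first-seen order."""
    counts = Counter(hashtags)  # Counter keeps insertion order on Python 3.7+
    return [[count, tag] for tag, count in counts.items()]


# Hypothetical sample data standing in for the %hashtags list.
hashtags = ["#ubot", "#regex", "#ubot", "#scraping", "#ubot", "#regex"]

for row in count_hashtag_duplications(hashtags):
    print(row)
```

    Note that the function takes its list as an argument and returns its result, rather than relying on shared global lists.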

     

    If someone can explain this, it'd give me a pretty big boost. I've learned a lot in my first two months, and I feel like learning to effectively incorporate defines into my bots will allow me to de-spaghettify my sprawling code and take me to another level. So I'll spend as much time and energy trying to understand as it takes.

     

    Thanks again, everyone.

  7. I've got 7 script tabs in one wildly unruly bot, with 7 different unruly interfaces. I want to bring it all together neatly and efficiently, and then conquer the internet.

     

    I gather that using defines, along with include, is the way to achieve this. 

     

    My problem is that I don't understand what parameters are and/or how they work in this situation. I've been googling for the last couple of hours, and I'm coming up a little short.

     

    I've broken my code down into defines, and I can call the defines as long as they're in the same bot. But trying to include them in another bot doesn't work. It seems that anything that uses variables (most of my defines have several, which will be set from the UI) doesn't work like this.

     

    As always, any guidance will be appreciated.

     

    Your buddy,

    Sebastian Rooks

  8. I know this thread is an antique, but you know, I have got 2 new computers since having this issue, and it never did work, and still doesn't on my new computers either. I can't quite understand how ANYONE is getting this to work, or why this would be so difficult... I can get to it just fine with FileZilla, but ubot is a no-go even after all these years!

     

    Hey I know this is from forever ago, and your problem was forever before that, but I'm dealing with the same error. Did you ever get it resolved?

    Thanks!

     

    I'm able to connect to my FTP server through ftp.exe, but ubot gives me that instant error.

  9. From my experience it depends on the website you are working with. With some websites the built-in browser is not able to handle everything properly. 

    Even the new one has some limitations on sites with a lot of active content, scripts and stuff.

     

    One of the reasons why I'm using Google Chrome in all my bots now. Much more stable. A bit more work to build the expressions, because there is no easy element selector. 

    But you can copy the necessary Xpath Expressions very easily. And yeah.. you need ExBrowser Plugin for that.

     

    If you want to test it, please let me know. 

     

    Dan

     

    So far my problems (and my experience) have been limited to Twitter, and today, trying to scrape the keyword ideas from Google's keywords planner because up until about 6 minutes ago, I was apparently too stupid to just download the .csv and get it to open properly.

     

    I've read good things about the Xpath plugin, and I'm definitely interested. But I'm still fairly new to everything, and I'm afraid Xpath is still outside of my understanding. 

  10. I do actually have a couple of instances of chrome open. I'm not sure if I did all the other times or not, but it's quite possible.

     

    That's something new to try that I never would have guessed, kind of like deselecting that checkbox I didn't know about. ;)

     

    Thanks man, this has been an ongoing frustration for the past couple of months. I'm feeling hopeful. 

  11. Thanks, it's nice to know that I'm not alone here.

     

    Unfortunately, on my end, there doesn't seem to be a correlation between bot size, use of defines, and advanced element selector crashing.

    It happened many times working on a big messy bot, and it's happened I think 5 times today on a really small bot, with no defines.

     

    ehh, such is life.

  12. Hi,

     

    I see that this thread is a few months old now, but I'm here because I'm having what, I think, is a similar problem.

     

     

    Using the advanced element selector has crashed Ubot for me....somewhere in the neighborhood of 100 times total.

     

    I can't predict what's going to cause it or when, but it always happens the same way.

     

    1. I select an element to scrape.

    2. I click the gear icon to open the advanced element selector.

    3. Sometimes it works, but other times it says:

                             "no element found", then the advanced element selector pops up completely blank, then "Ubot has stopped working".

     

    .......around 100 times, often several times in a row.

     

    If anyone else has experienced this and has some sort of a solution, I'd really like to know what it is.

     

    Thanks.

  13. Now I'm really close to being finished.

    I finally figured out how to scrape a tweet picture. I'm using:

    set(#didtweethaveattachment,$scrape attribute($element offset(<outerhtml=w"<img src=\"https://pbs.twimg.com/media/***.jpg\"alt=\"\" style=\"***;\">">,#offset),"fullsrc"),"Global")
     
    It works, and it pulls the url of the image that was in a tweet.  The only problem I'm still having is that because not all tweets have pictures, I don't know how to match the image urls to the tweets in my table. Does that make sense?

     

    Let's say that there's 10 tweets on a page and 5 of them have pictures, what I get looks like this:

     

    tweet 1                 image 1

    tweet 2                 image 2

    tweet 3                 image 3

    tweet 4                 image 4

    tweet 5                 image 5

    tweet 6

    tweet 7

    tweet 8

    tweet 9

    tweet 10

     

    I've been working on this thing for a week now, and I believe this is the last obstacle in my path. 

    If anyone can help me understand how to keep my table aligned, I'd finally be finished, and I'd certainly be thankful.
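
    The usual way around this kind of misalignment is to scrape one tweet container at a time and look for the image inside that container, writing a blank when there isn't one. A minimal Python sketch of the idea (not UBot code; the simplified markup and the function name are assumptions for illustration, only the pbs.twimg.com/media URL pattern comes from the selector above):

```python
import re

# Text of the tweet: whatever sits inside the first <p>...</p> of the fragment.
TEXT = re.compile(r"<p[^>]*>(.*?)</p>", re.DOTALL)
# Image URL: the src of an <img> served from pbs.twimg.com/media.
IMAGE = re.compile(r'<img[^>]*\bsrc="(https://pbs\.twimg\.com/media/[^"]+)"')


def tweet_rows(fragments):
    """One (text, image_url) row per tweet; image_url is "" when the tweet has no picture."""
    rows = []
    for fragment in fragments:
        text = TEXT.search(fragment)
        image = IMAGE.search(fragment)
        rows.append((
            text.group(1).strip() if text else "",
            image.group(1) if image else "",
        ))
    return rows


# Hypothetical per-tweet HTML fragments, e.g. the outer HTML of each tweet container.
fragments = [
    '<div class="tweet"><p>tweet 1</p><img src="https://pbs.twimg.com/media/aaa.jpg" alt=""></div>',
    '<div class="tweet"><p>tweet 2</p></div>',
    '<div class="tweet"><p>tweet 3</p><img src="https://pbs.twimg.com/media/bbb.jpg" alt=""></div>',
]

for row in tweet_rows(fragments):
    print(row)
```

    Because every tweet produces exactly one row, a tweet without a picture keeps its place in the table instead of letting the next image slide up into its slot.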

  14. Also, often times when I click on the advanced element selector from $scrape attribute, I get an error that pops up like this:

     

    Cannot deserialize the current JSON array
    (e.g. [1,2,3]) into type 'System.Boolean'
    because the type requires a JSON
    primitive value (e.g. string, number,
    boolean, null) to deserialize correctly.
    To fix this error either change the JSON to
    a JSON primitive value (e.g. string,
    number, boolean, null) or change the
    deserialized type to an array or a type
    that implements a collection interface
    (e.g. ICollection, IList) like List<T> that can
    be deserialized from a JSON array.
    JsonArrayAttribute can also be added to
    the type to force it to deserialize from a
    JSON array.
    Path "", line 1, position 1.
     
    Then the advanced selector window opens, says that there's no element found.
     
    Immediately after that, "UBot 5 has stopped working". 
     
    This sequence of events has taken place around 20 times since the start of trying to figure out scraping.
     
    Am I doing something wrong?
  15. Wow, thank you. I've spent a week attacking this project from different angles, none of which have quite gotten me 100% of the way there.

     

    This is how I'm implementing what you gave me in my bot. I've only got two problems left before I can finally stop working on this thing and start working with it.

     

    1. I can't figure out how to scrape a tweet picture, or even just a way to indicate in the table that there was originally a picture associated with the tweet would be good enough.

     

    Without knowing that, I'll be left with text tweets that don't make sense without context, but have higher like and retweet numbers, making the data not worth much. So far I haven't been able to scrape any picture info at all, but even if I could, not all tweets have pictures, so how could I keep the data straight in my table?

     

    Do you have any idea what I could do about that?

     

    UPDATE: I fixed problem 2! I did it by scraping the innertext of the tweet header (instead of trying for links), then using the regular expression @\w* to pull the handle out of that inner text. I'm learning!

     

    2. Not as big of a problem as number 1, but trying to set the variables #profile_link and #name by scraping a different attribute of the same thing returns different results. It reads the name properly, but always pulls the href of the account whose page I'm on, rather than the author of that particular tweet... again, it gets the name right. I'm confused.

    set(#offset,1,"Global")
    set(#keepgoing,0,"Global")
    loop while($comparison(#keepgoing,"= Equals",0)) {
        set(#picturetweet,$scrape attribute($element offset(<class="Grid Grid--withGutter">,1),"fullhref"),"Global")
        set(#name,$scrape attribute($element offset(<class="fullname js-action-profile-name show-popup-with-id">,#offset),"innertext"),"Global")
        set(#profile_link,$scrape attribute($element offset(<class="fullname js-action-profile-name show-popup-with-id">,#offset),"fullhref"),"Global")
        set(#tweet_text,$scrape attribute($element offset(<class=w"TweetTextSize TweetTextSize--*6px js-tweet-text tweet-text">,#offset),"innertext"),"Global")
        set(#retweets,$scrape attribute($element offset(<class="ProfileTweet-action--retweet u-hiddenVisually">,#offset),"innertext"),"Global")
        set(#likes,$scrape attribute($element offset(<class="ProfileTweet-action--favorite u-hiddenVisually">,#offset),"innertext"),"Global")
        add list to list(%tweetswithdeets,$list from text("{#tweet_text},,,{#retweets},,,{#likes},,,{#name},,,{#profile_link}",",,,"),"Delete","Global")
        if($comparison($list total(%tweetswithdeets),">= Greater than or equal to",4)) {
            then {
                add list to table as row(&tweettable,#offset,0,%tweetswithdeets)
            }
            else {
                clear list(%tweetswithdeets)
                set(#keepgoing,1,"Global")
            }
        }
        clear list(%tweetswithdeets)
        increment(#offset)
    }
    
    

    Thanks

  16. I honestly don't know how I could thank you enough for this. This has been sucking the life and productivity out of me for days now.

    You gave me options that I'm going to save and study, and surely apply to different things at different times in the future.

     

    I promise that when I get to the point that I've got more answers than questions, I'll take time to help people out too.

     

    I went with the remove from list option, and I made a couple of changes to meet my exact needs. Removing the last 4 lines is a consistent win. But the first two lines can vary. I made some changes in how they're handled, that seem to be working every time, providing me with additional lists of data for my table like whether the tweet was a retweet, and if so, who was it retweeted from. None of which would have been possible without your help.

    clear list(%clean)
    clear list(%retweet or not)
    set(#tweat,$next list item(%scrapedtweets),"Global")
    add list to list(%clean,$list from text(#tweat,$new line),"Delete","Global")
    comment("remove last 4 items")
    loop(4) {
        remove from list(%clean,$subtract($list total(%clean),1))
    }
    comment("remove first")
    if($contains($list item(%clean,0),"Retweeted")) {
        then {
            add item to list(%retweeted from,$find regular expression($list item(%clean,1),"@\\w*"),"Don\'t Delete","Global")
            add item to list(%retweet or not,"retweet","Don\'t Delete","Global")
            remove from list(%clean,0)
            remove from list(%clean,0)
        }
        else {
            add item to list(%retweet or not,$nothing,"Don\'t Delete","Global")
            add item to list(%retweeted from,$nothing,"Don\'t Delete","Global")
        }
    }
    set(#new tweet,%clean,"Global")
    add item to list(%final tweet,#new tweet,"Don\'t Delete","Global")
    alert(#new tweet)
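
    For comparison, here is roughly the same clean-up written as a Python sketch (not UBot code; the function name is made up, and unlike the "Delete" option it does not drop duplicate lines). It removes the last four lines, and if the first line contains "Retweeted" it records the @handle from the second line and drops both header lines:

```python
import re


def clean_tweet(raw):
    """Return (tweet_lines, retweeted_from) for one scraped block of text."""
    lines = raw.strip().splitlines()
    lines = lines[:-4]  # drop the counts line plus the Reply/Retweet, Like and More lines
    retweeted_from = ""
    if lines and "Retweeted" in lines[0]:
        handle = re.search(r"@\w*", lines[1]) if len(lines) > 1 else None
        retweeted_from = handle.group(0) if handle else ""
        lines = lines[2:]  # drop the "X Retweeted" line and the author line
    return lines, retweeted_from


# Sample list item in the format used elsewhere in this thread.
raw = (
    "Penelope Retweeted\n"
    "I'm yer Freckleberry @PicklesnPickles  46 mins46 minutes ago\n"
    "He adores my freckles.  I think I'll let him count them.\n"
    "22 retweets 32 likes\n"
    "Reply   Retweet  22\n"
    "Like 32\n"
    "More\n"
)

print(clean_tweet(raw))
```

    As in the UBot version, an item that is not a retweet keeps its first lines untouched, so the author line still needs separate handling in that case.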
    

    also I use this to play with regex

     

    http://regexhero.net/tester/

     

    it is .Net style which is same as ubot

    Thank you, that is tremendously helpful. I read that there are different dialects of regex, but I didn't know which was which.

     

    tracker issue if you can confirm this +1 it please

     

    http://tracker.ubotstudio.com/issues/966

     

    I can absolutely confirm this.

    It consistently takes just over 30 seconds to get through this loop, which is going to be a bit problematic with runs through hundreds of list items.

    I tried to confirm in the tracker, but it won't let me log in, won't let me register, won't recognize my info to send me a new password. I don't know what that's about.

     

     

    Thank you, Thank you, and Thank you again.

  17. Thank you SOOO much!

     

    This is a big advancement in the right direction. That got me something, which is a lot more than I had before, which was nothing.

     

    I just need to figure out how to modify the beginning and the end. The first line doesn't always contain "ago"; the only real constant I see is that it contains an @ symbol somewhere towards the middle of that line.

     

    Also doesn't always end with ".", but it does always end with (?=\[d*\s]retweets?) or what I imagine looks something like that (but probably doesn't.)

     

    One weird thing though.....your code doesn't catch anything for me in the UBot regex editor. But when I ran it in my bot it works perfectly. 

    It's going to be hard to learn this if I can't trust the editor. When I got something to work in EditPadPro, it didn't work in my bot.

     

    Is this actually a thing that happens or am I doing something wrong? 

     

    Thanks again!

  18. Hi, this may actually drive me insane. I've been working on what should have been a simple project for days now, and I've come to a sticking point I'd really appreciate some help getting through.

     

    I have a list populated with entries like this:

     Penelope Retweeted
     I'm yer Freckleberry ‏@PicklesnPickles  46 mins46 minutes ago
    He adores my freckles.  I think I'll let him count them.
    22 retweets 32 likes
    Reply   Retweet  22   
    Like 32  
    More
    
    

    This is my first project featuring regex. So far, I've been able to select the numbers of retweets and likes from the fourth line from the bottom and add it to a table. I'm pretty happy about that.

     

    But I can't figure out how to get the tweet itself.

    I need something like "everything after the first line containing the "@", but before the fourth line from the bottom". In some way or another.

     

    This worked in testing on EditPadPro, but I can't get anything to select an item after a line break in Ubot.

    (?<=@.{1,100})(\s\n.*\s\n?.*)(?=\s\n\d{0,6}\s+retweets?\s+\d{0,}\s+likes?\s+\nReply\s+Retweet?\s+\d{0,}\w{0,1}) 

    I've tried shortening it down, cutting everything out of it, I just can't get it to cross a \n in Ubot. My head hurts and I believe that my brains might be melting, so please help a brother out here.
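
    Two things are worth checking when a pattern refuses to cross a line break. First, "." does not match a newline unless single-line (dot-all) mode is switched on. Second, scraped text may use \r\n rather than \n between lines (an assumption about this data, not something confirmed here). A minimal Python sketch of the "everything after the @ line, before the counts line" idea, written with a capture group because Python's re module, unlike .NET, does not allow variable-length lookbehind:

```python
import re

TWEET_BODY = re.compile(
    r"@\w+[^\r\n]*\r?\n"           # the author line: first line containing an @handle
    r"(?P<body>.*?)"                # the tweet itself, possibly several lines
    r"\r?\n[\d,]*\s*retweets?\b",   # stop at the "N retweets M likes" line
    re.DOTALL,                      # let "." match line breaks inside the body
)


def tweet_body(item):
    """Return the tweet text from one scraped list item, or "" if nothing matches."""
    match = TWEET_BODY.search(item)
    return match.group("body").strip() if match else ""


# Sample list item from above, joined with Windows-style line breaks.
item = "\r\n".join([
    "Penelope Retweeted",
    "I'm yer Freckleberry @PicklesnPickles  46 mins46 minutes ago",
    "He adores my freckles.  I think I'll let him count them.",
    "22 retweets 32 likes",
    "Reply   Retweet  22",
    "Like 32",
    "More",
])

print(tweet_body(item))
```

    The same pattern shape should carry over to a .NET-style engine, but the flag for dot-all mode and the escaping rules differ, so treat this as a starting point rather than a drop-in answer.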

     

    Thanks guys.

  19. Learjet, thank you so much.

     

    Being as far out of my element as I feel, that's a tremendous help. I didn't even realize that UBot had a regex editor. (I didn't even know what regex was until yesterday. This is all still very new to me.)

     

    I don't have Regex Cheater. Thanks to you, I will in a few minutes

     

    It's just good to know that I'm not the only one who's experienced this situation. It's encouraging. Sometimes that's enough.

     

    Thanks..

  20. Ok, still working on this thing. I couldn't get separate scrapes to match up data consistently over differing circumstances. So, without knowing what else to do, I turned towards regular expressions (which I know even less about).

     

    After googling, watching tutorials, etc, I came up with three regular expressions that highlight what I want in EditPad Pro, but don't work right when I try to implement them in my UBot code. It would sure be swell if someone could take a look at this and help me out. It really would.

     

    Let's start with the format of the list item I'm scraping things from:

    Tartlandia Retweeted
    Stabbatha Christy @LoveNLunchmeat Feb 5
    I don't understand the gravity of the situation because I don't understand physics.
    162 retweets 272 likes
    Reply Retweet 162
    Like 272
    More
    

    Every list item looks like that, with the following exceptions:

     

    1. Sometimes the first line isn't present.

    2. Sometimes the tweet portion takes up more than one line, or includes more than one line of space.

    3. Sometimes the bottom two numbers (and only the bottom two, not the two that are on the same line together) end with "K".

    4. Sometimes the numbers have fewer digits, or more digits, or are absent entirely.

    5. Sometimes the numbers I want, the ones on the 4th line from the bottom, contain a comma.

     

    Here's the Regex that I used to highlight the tweet portion:

    (?<=@.{1,100})(\s\n.*\s\n?.*\s\n.*\s\n.*\s\n.*)(?=\s\n\d{0,6}\s+retweets\s+\d{0,}\s+likes\s+\nReply\s+Retweet\s+\d{0,}\w{0,1})
    

    In EditPadPro, it works with 1-5 lines of tweet text and/or space. In my bot it populates a column with blank spaces when used like this:

    set(#count,0,"Global")
    set list position(%scrapedtweets,0)
    loop($list total(%scrapedtweets)) {
        set table cell(&tweettable,#count,0,$find regular expression($next list item(%scrapedtweets),"(?<=@.\{1,100\})(\\s\\n.*\\s\\n?.*\\s\\n.*\\s\\n.*\\s\\n.*)(?=\\s\\n\\d\{0,6\}\\s+retweets\\s+\\d\{0,\}\\s+likes\\s+\\nReply\\s+Retweet\\s+\\d\{0,\}\\w\{0,1\})"))
        increment(#count)
    }
    
    

    To pull the number of likes from the top line of likes (though I realize now it probably won't work if there's a comma in that number), this works in EditPad:

    (?<=\d{0,}\s+retweets\s+)\d{0,}(.*)(?=\s+likes\s+\nReply\s+Retweet\s+\d{0,}\w{0,1})
    
    

    This is what it looks like in my bot. Just like with the tweets, I get an empty column.

    set(#count,0,"Global")
    set list position(%scrapedtweets,0)
    loop($list total(%scrapedtweets)) {
        set table cell(&tweettable,#count,2,$find regular expression($next list item(%scrapedtweets),"(?<=\\d\{0,\}\\s+retweets\\s+)\\d\{0,\}(.*)(?=\\s+likes\\s+\\nReply\\s+Retweet\\s+\\d\{0,\}\\w\{0,1\})"))
        increment(#count)
    }
    
    

    The last part is the number of retweets:

    \d+(.*)(?=\s+retweets\s+\d+\s+likes\s+\nReply\s+Retweet\s+\d+\w{0,1})
    

    In my bot, this almost works!

    It populates my column with the correct numbers, except that it separates two or more digit numbers, stacking them on top of each other. Someone please help me correct this.

    set(#count,0,"Global")
    set list position(%scrapedtweets,0)
    loop($list total(%scrapedtweets)) {
        set table cell(&tweettable,#count,1,$find regular expression($next list item(%scrapedtweets),"(?<=Reply\\s\{0,3\}Retweet .*)[0-9]"))
        increment(#count)
    }
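
    The stacked digits are what you would expect from a character class with no quantifier: [0-9] matches exactly one digit, so "162" comes back as three separate matches. Adding + makes each run of digits a single match. A small Python sketch of the difference (not UBot code; the helper name and the K handling are illustrative assumptions based on the format described above):

```python
import re

line = "Reply Retweet 162"

single_digits = re.findall(r"[0-9]", line)   # one match per digit
whole_number = re.findall(r"[0-9]+", line)   # one match per run of digits

print(single_digits)
print(whole_number)


def retweet_count(item):
    """Return the number after "Reply Retweet" as text, or "" if it is absent.

    Allows commas, dots and a trailing K, since the counts are said to vary.
    """
    match = re.search(r"Reply\s+Retweet\s+([\d,.]+K?)", item)
    return match.group(1) if match else ""


print(retweet_count("Reply Retweet 162\nLike 272\nMore"))
```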
     

    This is my first ever attempt at regular expressions, and I feel like I'm in over my head. I guessed and tested my way through making it work in the editor. But I don't know how to even begin debugging the problem with it running in ubot.

     

    This probably belongs in the regex forum, but since it's a continuation of my original problem, I'm adding it on here. 

     

    Seriously, a little help would be really appreciated. I mean really appreciated.

  21. Here's what I've got so far. It's sloppy, I'm sure there must be a better way.

    add list to list(%scrapedtweets,$scrape attribute(<(outerhtml=w"<p class=\"TweetTextSize TweetTextSize--*6px js-tweet-text tweet-text\" lang=\"en\" data-aria-label-part=\"0\">*</p>" OR outerhtml=w"<a class=\"QuoteTweet-link js-nav\"*")>,"innertext"),"Delete","Global")
    wait(3)
    add list to list(%likes,$scrape attribute(<innertext=w"Like *">,"innertext"),"Don\'t Delete","Global")
    wait(3)
    add list to list(%retweets,$scrape attribute(<innertext=w"Retweet  *">,"innertext"),"Don\'t Delete","Global")
    set(#delete,1,"Global")
    wait(3)
    loop($list total(%retweets)) {
        if($comparison(#delete,"=",1)) {
            then {
                add item to list(%retweetcleaned,$next list item(%retweets),"Don\'t Delete","Global")
                set(#delete,0,"Global")
            }
            else {
                add item to list(%retweetduplicate,$next list item(%retweets),"Don\'t Delete","Global")
                set(#delete,1,"Global")
            }
        }
    }
    wait(3)
    set list position(%retweetcleaned,0)
    set list position(%scrapedtweets,0)
    loop($list total(%scrapedtweets)) {
        if($comparison($next list item(%retweetcleaned),"=",$next list item(%scrapedtweets))) {
            then {
                remove from list(%retweetcleaned,$next list item(%retweetcleaned))
            }
        }
    }
    

    The only way I could get the number of retweets to scrape returned every value twice, so I split them into separate lists. Also, if a tweet starts with the word "Like" or "Retweet", it scrapes the tweet into those lists too, which screws up the list synchronization. That's what those other two loops are about.

    wait(3)
    set list position(%scrapedtweets,0)
    set list position(%likes,0)
    loop($list total(%scrapedtweets)) {
        if($comparison($next list item(%likes),"=",$next list item(%scrapedtweets))) {
            then {
                remove from list(%likes,$next list item(%likes))
            }
        }
    }
    wait(3)
    add list to table as column(&tweettable,0,0,%scrapedtweets)
    wait(3)
    add list to table as column(&tweettable,0,1,%retweetcleaned)
    wait(3)
    add list to table as column(&tweettable,0,2,%likes)
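
    The toggle loop that separates the doubled retweet values amounts to taking every other item. As a Python sketch (not UBot code; the function name and sample values are made up):

```python
def split_alternating(items):
    """Split a list into its 1st, 3rd, 5th... items and its 2nd, 4th, 6th... items."""
    return items[0::2], items[1::2]


# Hypothetical doubled scrape: each value appears twice in a row.
retweets = ["Retweet 22", "Retweet 22", "Retweet 5", "Retweet 5"]
cleaned, duplicates = split_alternating(retweets)

print(cleaned)
print(duplicates)
```

    This only holds while every value really is scraped exactly twice; one stray extra match shifts everything after it, which is part of why per-tweet scraping tends to be more robust than reconciling separate lists.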
    

    This seems awfully crude, and more importantly, it's still not able to distinguish between an original tweet and a retweet, and it doesn't recognize if a tweet had a picture attached. But it's my best effort so far.

     

    I'd still, really, really appreciate any guidance anyone can offer me. This is just the beginning of the scraping I want to do, and I want to make sure I learn best practices to apply to the next one, once I finally get this working right.

     

    Is this a job better suited to regular expressions? 

     

    Thanks

  22. Well over 12 hours into this particular problem and all I've been able to accomplish is scraping tweets with no additional data:

    add list to table as column(&tweettable,0,0,$scrape attribute(<outerhtml=w"<p class=\"TweetTextSize TweetTextSize--16px js-tweet-text tweet-text\" lang=\"en\" data-aria-label-part=\"0\">*</p>">,"innertext"))
    
    

    or 

    add list to table as column(&tweettable,0,1,$scrape attribute(<data-tweet-id=w"*">,"innertext"))
    
    

    That returns everything together, leaving me with the problem of separation. I haven't been able to reliably scrape any one thing separately, other than the tweet itself. 

     

    I just want to grab number of likes and retweets, along with whether or not the tweet itself was retweeted and put it all into a table.

     

    Someone please put me out of my misery.

  23. You've already helped me tremendously by telling me to focus on scraping separately instead of trying to separate afterwards. I've spent a fair amount of time and frustration trying to do it the wrong way. 

     

    Forgive me if I copied too much. I'm sure that not knowing exactly which parts I need is a big part of my current problem.

    <div class="tweet original-tweet js-original-tweet js-stream-tweet js-actionable-tweet js-profile-popup-actionable
    " data-tweet-id="695704208687894530" data-disclosure-type="" data-item-id="695704208687894530" data-permalink-path="/Susanjmccann/status/695704208687894530" data-retweet-id="695704319836839936" data-screen-name="Susanjmccann" data-name="Susan McCann" data-user-id="313199716" data-expanded-footer="<div class="js-tweet-details-fixer tweet-details-fixer">
    
      <div class="js-machine-translated-tweet-container"></div>
        <div class="js-tweet-stats-container tweet-stats-container ">
        </div>
    
      <div class="client-and-actions">
      <span class="metadata">
        <span>12:23 p.m. - 5 Feb 2016</span>
           &middot; <a class="permalink-link js-permalink js-nav" href="/Susanjmccann/status/695704208687894530"  tabindex="-1">Details</a>
      </span>
    </div>
    
    
    </div>
    " data-mentions="KnowledgeBishop" data-retweeter="MeemsterB" data-you-follow="false" data-follows-you="false" data-you-block="false">
    
    
        <div class="context">
          
              <div class="tweet-context with-icn
        
        ">
    
          <span class="Icon Icon--small Icon--retweeted"></span>
    
    
    
    
    
    
                <span class="js-retweet-text"><a class="pretty-link js-user-profile-link" href="/MeemsterB" data-user-id="2835431720"><b>MeemsterB</b></a> Retweeted</span>
    
    
          
    
        </div>
    
    
        </div>
    
        <div class="content">
          
    
          
          <div class="stream-item-header">
              <a class="account-group js-account-group js-action-profile js-user-profile-link js-nav" href="/Susanjmccann" data-user-id="313199716">
        <img class="avatar js-action-profile-avatar" src="https://pbs.twimg.com/profile_images/428117971139981312/9AWbrFsZ_bigger.jpeg" alt="">
        <strong class="fullname js-action-profile-name show-popup-with-id" data-aria-label-part="">Susan McCann</strong>
        <span>‏</span><span class="username js-action-profile-name" data-aria-label-part=""><s>@</s><b>Susanjmccann</b></span>
        
      </a>
    
            <small class="time">
      <a href="/Susanjmccann/status/695704208687894530" class="tweet-timestamp js-permalink js-nav js-tooltip" data-original-title="12:23 p.m. - 5 Feb 2016"><span class="_timestamp js-short-timestamp js-relative-timestamp" data-time="1454703783" data-time-ms="1454703783000" data-long-form="true" aria-hidden="true">1 day</span><span class="u-hiddenVisually" data-aria-label-part="last">1 day ago</span></a>
    </small>
    
              
              
          </div>
    
          
            <p class="TweetTextSize TweetTextSize--16px js-tweet-text tweet-text" lang="en" data-aria-label-part="0">May your faith be unshakeable and your will, unbreakable. - <a href="/KnowledgeBishop" class="twitter-atreply pretty-link js-nav" dir="ltr" data-mentioned-user-id="117477071"><s>@</s><b>KnowledgeBishop</b></a> <a href="/hashtag/quote?src=hash" data-query-source="hashtag_click" class="twitter-hashtag pretty-link js-nav" dir="ltr"><s>#</s><b>quote</b></a></p>
    
    
    
    
          
            
    
          
          
      <div class="expanded-content js-tweet-details-dropdown">
        
          
      </div>
    
    
          
          <div class="stream-item-footer">
      
    
      
            <div class="ProfileTweet-actionCountList u-hiddenVisually">
        
        <span class="ProfileTweet-action--reply u-hiddenVisually"></span>
        <span class="ProfileTweet-action--retweet u-hiddenVisually">
          
          <span class="ProfileTweet-actionCount" data-tweet-stat-count="2">
            <span class="ProfileTweet-actionCountForAria" data-aria-label-part="">2 retweets</span>
          </span>
        </span>
        <span class="ProfileTweet-action--favorite u-hiddenVisually">
          <span class="ProfileTweet-actionCount" data-tweet-stat-count="2">
            <span class="ProfileTweet-actionCountForAria" data-aria-label-part="">2 likes</span>
          </span>
        </span>
      </div>
    
        <div class="ProfileTweet-actionList js-actions" role="group" aria-label="Tweet actions">
          <div class="ProfileTweet-action ProfileTweet-action--reply">
      <button class="ProfileTweet-actionButton u-textUserColorHover js-actionButton js-actionReply" data-modal="ProfileTweet-reply" type="button">
        <div class="IconContainer js-tooltip" title="Reply">
          <span class="Icon Icon--reply"></span>
          <span class="u-hiddenVisually">Reply</span>
        </div>
      </button>
    </div>
          <div class="ProfileTweet-action ProfileTweet-action--retweet js-toggleState js-toggleRt">
      <button class="ProfileTweet-actionButton  js-actionButton js-actionRetweet" data-modal="ProfileTweet-retweet" type="button">
        <div class="IconContainer js-tooltip" title="Retweet">
          <span class="Icon Icon--retweet"></span>
          <span class="u-hiddenVisually">Retweet</span>
        </div>
          <div class="IconTextContainer">
            <span class="ProfileTweet-actionCount">
              <span class="ProfileTweet-actionCountForPresentation" aria-hidden="true">2</span>
            </span>
          </div>
      </button><button class="ProfileTweet-actionButtonUndo js-actionButton js-actionRetweet" data-modal="ProfileTweet-retweet" type="button">
        <div class="IconContainer js-tooltip" title="Undo retweet">
          <span class="Icon Icon--retweet"></span>
          <span class="u-hiddenVisually">Retweeted</span>
        </div>
          <div class="IconTextContainer">
            <span class="ProfileTweet-actionCount">
              <span class="ProfileTweet-actionCountForPresentation" aria-hidden="true">2</span>
            </span>
          </div>
      </button>
    </div>
          <div class="ProfileTweet-action ProfileTweet-action--favorite js-toggleState">
      <button class="ProfileTweet-actionButton js-actionButton js-actionFavorite" type="button">
        <div class="IconContainer js-tooltip" title="Like">
          <div class="HeartAnimationContainer">
            <div class="HeartAnimation"></div>
          </div>
          <span class="u-hiddenVisually">Like</span>
        </div>
          <div class="IconTextContainer">
            <span class="ProfileTweet-actionCount">
              <span class="ProfileTweet-actionCountForPresentation" aria-hidden="true">2</span>
            </span>
          </div>
      </button><button class="ProfileTweet-actionButtonUndo u-linkClean js-actionButton js-actionFavorite" type="button">
        <div class="IconContainer js-tooltip" title="Undo like">
          <div class="HeartAnimationContainer">
            <div class="HeartAnimation"></div>
          </div>
          <span class="u-hiddenVisually">Liked</span>
        </div>
          <div class="IconTextContainer">
            <span class="ProfileTweet-actionCount">
              <span class="ProfileTweet-actionCountForPresentation" aria-hidden="true">2</span>
            </span>
          </div>
      </button>
    </div>
          
    
            <div class="ProfileTweet-action ProfileTweet-action--more js-more-ProfileTweet-actions">
        <div class="dropdown">
      <button class="ProfileTweet-actionButton u-textUserColorHover dropdown-toggle js-dropdown-toggle" type="button">
          <div class="IconContainer js-tooltip" title="More">
            <span class="Icon Icon--dots"></span>
            <span class="u-hiddenVisually">More</span>
          </div>
      </button>
      <div class="dropdown-menu">
      <div class="dropdown-caret">
        <div class="caret-outer"></div>
        <div class="caret-inner"></div>
      </div>
      <ul>
          <li class="share-via-dm js-actionShareViaDM" data-nav="share_tweet_dm">
            <button type="button" class="dropdown-link">Share via Direct Message</button>
          </li>
        
          <li class="copy-link-to-tweet js-actionCopyLinkToTweet">
            <button type="button" class="dropdown-link">Copy link to Tweet</button>
          </li>
          <li class="embed-link js-actionEmbedTweet" data-nav="embed_tweet">
            <button type="button" class="dropdown-link">Embed Tweet</button>
          </li>
              <li class="mute-user-item pretty-link"><button type="button" class="dropdown-link">Mute</button></li>
      <li class="unmute-user-item pretty-link"><button type="button" class="dropdown-link">Unmute</button></li>
    
            <li class="block-link js-actionBlock" data-nav="block">
              <button type="button" class="dropdown-link">Block</button>
            </li>
            <li class="unblock-link js-actionUnblock" data-nav="unblock">
              <button type="button" class="dropdown-link">Unblock</button>
            </li>
            <li class="report-link js-actionReport" data-nav="report">
              <button type="button" class="dropdown-link">
                
                Report
              </button>
            </li>
      </ul>
    </div>
    
    </div>
    
      </div>
    
        </div>
    
    </div>
      
    
    
    
          
          
    
        </div>
      </div>
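
In the markup above, the retweeted tweet's container carries `data-retweet-id` and `data-retweeter` attributes alongside `data-tweet-id`. Assuming original tweets simply lack those two attributes (the paste does not show one, so this is a guess), the retweet flag can be read straight off the container. A minimal sketch in Python rather than UBot script, using the standard library's `HTMLParser`; the second sample tweet and the `TweetFlags` class are made up for illustration.

```python
from html.parser import HTMLParser

# First block mirrors attributes from the pasted tweet; the second is hypothetical.
SAMPLE = '''
<div class="tweet original-tweet" data-tweet-id="695704208687894530"
     data-retweet-id="695704319836839936" data-screen-name="Susanjmccann"
     data-retweeter="MeemsterB">
  <p class="tweet-text">May your faith be unshakeable</p>
</div>
<div class="tweet original-tweet" data-tweet-id="100" data-screen-name="SomeUser">
  <p class="tweet-text">An original tweet</p>
</div>
'''


class TweetFlags(HTMLParser):
    """Collect one record per element that has a data-tweet-id attribute."""

    def __init__(self):
        super().__init__()
        self.tweets = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "div" and "data-tweet-id" in attr:
            self.tweets.append({
                "id": attr["data-tweet-id"],
                "author": attr.get("data-screen-name", ""),
                # Assumption: only retweets carry data-retweet-id.
                "is_retweet": "data-retweet-id" in attr,
                "retweeter": attr.get("data-retweeter", ""),
            })


parser = TweetFlags()
parser.feed(SAMPLE)
for tweet in parser.tweets:
    print(tweet)
```

Reading attributes from the container avoids matching on the visible "Retweeted" label, which depends on the page language.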
    