UBot Underground

awesome sauce

Members
  • Content Count

    22

Everything posted by awesome sauce

  1. No. There is nothing unique like that on each product that is the same across all sites.
  2. I was wondering how you would combine multiple data sets to ensure that the data actually matches? For example, say you were scraping retail websites for prices and wanted to make a price comparison website. How would you combine the data so that "Hasbro Hulk Action Figure" and "6-inch Hulk Figure" match if they are the same product, just named differently on the different websites? I don't really need specific code, I was more just wondering how one would go about combining multiple data sets without doing it manually?
  3. Alright, so I changed the list so it doesn't delete duplicates. However, for some reason everything is being duplicated... a lot. Out of fewer than 52 items, the following code returned 1144 emails (many are duplicates, and the blank lines aren't there). define Clean Business Data { clear list(%email_address_clean) loop($list total(%email_address)) { add list to list(%email_address_clean,$list from text($replace(%email_address,"mailto:",$nothing),$new line),"Don\'t Delete","Global") } }
  4. That helped me be able to remove mailto:, however it's removing my blank values from my first list. Example: %email_address has 118 values, some of which are filled with $nothing because there isn't an email for that. %email_address_clean then only has 44 items and has removed all the lines with $nothing. Is it possible to use replace but keep the $nothing values intact? (then I should have 118 items in my list, even though I just have 44 email addresses) Here is the code I'm using to remove mailto: define Clean Business Data { loop($list total(%email_address)) { add lis
  5. I'm scraping a website that when I scrape the email addresses they have 'mailto:' at the start. What I would like to do is remove 'mailto:' so I just have the address. The email addresses are currently stored in a list. Here's the code I have, but it's not doing anything. I can't quite figure it out because I can't use replace with the if statement. Any suggestions? define Clean Business Data { loop($list total(%email_address)) { set(#email_row,$next list item(%email_address),"Global") set(#email_row_clean,$replace(#email_row,"mailto:",""),"Global") } }
  6. @Learjet those are the exact errors I keep getting. @Brutal is it more difficult than unchecking the box? I tried that and the errors still seem to pop up or the program just hangs. @Pash it was. UBot support won't help with this, even though there is nothing online to be found about it.
  7. I hate to have to make a post about this, but it looks like it's the only way I can solve my problems with this software. Basically every script that I make ends up throwing errors. I don't know what the errors mean, as there is nothing in the documentation about them, there is nothing on Google about them, and support apparently can't help either. So I made a ticket about a script that keeps failing in the exact same spot. I provided the script. Buddy at support asked me for the keyword and zip I used to get the error to come up every time. I gave that, then he runs the script a few
  8. Ardley216, were you able to use the new command? When I use it I get a script error: The method or operation is not implemented.
  9. Where are the notes? The trackers don't seem to be updated.
  10. Thank you. I actually figured it out. I make custom commands, so what I did was loop the custom command while class next is present; then, once it's on the last page, I just run the custom command one more time outside of the loop.
  11. I am having a bit of a problem navigating multiple pages on a site with UBot. I've used a loop that looks for class=next, and if it finds it, it runs my scrape function. However, I am missing the last page of data. How would I visit all of the pages, INCLUDING the last page?
  12. To add a little more about my script running about 5 times before failing: it seems that the more times I 'test' a script in UBot Studio, the more likely it is to fail. For example, I modified the script in my last post and ran it a couple of times. The first time I run it I get 268 items (and it is lightning fast), but if I run it any more than that one time in UBot Studio, the script misses items to scrape, or just fails by freezing or not completing (and is very slow). Why does UBot do this?
  13. OMG thank you so much! That works perfectly and really saves me some time scraping data. Is it also possible just to clean the 'urls' list without making another list? I know you said the way above is likely easiest, but I like to know other ways to do things as well.
  14. I was just wondering if it's possible to clean your data with UBot? For example, say you have a list with mostly the same URLs, but some of those URLs are also from different domains. Can you tell UBot to remove URLs from the list if they don't contain 'google.com'? How would this look in code?
  15. Sure. I can tell you that the software is unstable. No matter what simple scripts I make on any website, they crash, fail, and throw errors. It seems to me that it's either something with a loop or a list, but I don't know. Here's another script that I made that fails half way through. clear list(%business_page_urls) ui text box("Business Name or Keyword:",#keyword) ui text box("City or Zip Code:",#location) navigate("http://www.yellowpages.com/","Wait") wait for browser event("Everything Loaded","") type text(<name="search_terms">,#keyword,"Standard") type text(<name="geo_loca
  16. Hmmm... I still can't get this to work. If I use change attribute, seemingly nothing happens. EDIT: I got it to work using deliter's suggestion. However, instead of 'innerhtml', 'innertext' works on change attribute.
  17. I would love a tutorial or something too...
  18. Maybe it's just the server I'm running it on. Does anyone else here run UBot on AWS? If so, how's your performance? I just rebooted my server again and started the script. On about the 9th object to scrape UBot froze. I had the performance monitor up and it appears that my processor is getting maxed out. This page is indicating that I may not even have enough RAM to run UBot. However, looking around the net, it appears other people are running UBot with less than the suggested 2GB RAM. I'm not sure what version they are running though. Is anyone running version 5 with ~1GB RAM? Should I
  19. I've been following the below tutorial to make what I would consider a very basic bot: However, I can't get the bot to complete without crashing or throwing errors. Here's one error I've got: Error converting value True to type 'System.Collection.Generic.List'1[system.String]'.Path", line, position 4 The script is only trying to grab 36 items. Here's my version of the script: ui stat monitor("Movie Titles:",$list total(%movie titles)) ui stat monitor("Thumbnails:",$list total(%thumbnailurls)) ui stat monitor("Full Size:",$list total(%largeimageurl)) define Scrape Data { clear tabl
  20. Hi, I'm trying to scrape a website that shows a lot of ads around the listings. I'm only trying to scrape the organic listings. The way the site is set up it has different classes for the ads and the organic listings. So what I am trying to do is select the "organic" class, then select all of the "business" classes that are all inside of the "organic" class and save them to a list. Is this possible with UBot? P.S. The "business" class exists in the ads classes as well, so I need to do it this way I think.
  21. Thanks! That totally worked. I think my problem had to do with the offset.
  22. Hi guys, I just bought UBot the other day and am working on my first project. I would like to scrape the first page of Google for a query and get all of the links. This is what I have so far for the scraping part of the program. For some reason I can't extract the URL from class r, even though, looking at the source code, it appears to me that this should work. clear list(%scraped urls) add list to list(%scraped urls,$scrape attribute($element offset(<class="r">,0),"fullhref"),"Don\'t Delete","Global") Hopefully it's just a noob mistake that's a simple fix.
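
Several of the posts above come down to the same three patterns: stripping 'mailto:' while keeping blank rows so the list length does not change (posts 3 to 5), keeping only URLs from one domain (posts 13 and 14), and paging through results without skipping the last page (posts 10 and 11). Below is a minimal sketch of the logic in Python rather than UBot script, so it makes no claims about UBot's own commands. The function names and the scrape_page, has_next and go_next callbacks are hypothetical stand-ins for whatever the bot actually does.

```python
from urllib.parse import urlparse

MAILTO_PREFIX = "mailto:"


def strip_mailto(addresses):
    """Remove a leading 'mailto:' from each item, one item at a time.

    Blank items are appended unchanged, so the output always has the
    same number of rows as the input.
    """
    cleaned = []
    for address in addresses:
        if address.lower().startswith(MAILTO_PREFIX):
            address = address[len(MAILTO_PREFIX):]
        cleaned.append(address)
    return cleaned


def keep_urls_from_domain(urls, domain):
    """Keep URLs whose host is `domain` or a subdomain of it.

    Comparing the parsed host avoids matching URLs that merely mention
    the domain in a path or query string. URLs without a scheme have
    no parsed host and are dropped.
    """
    kept = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            kept.append(url)
    return kept


def scrape_all_pages(scrape_page, has_next, go_next):
    """Scrape first, then check for a next link.

    Checking after the scrape is what makes the last page count.
    All three arguments are hypothetical callbacks.
    """
    results = []
    while True:
        results.extend(scrape_page())
        if not has_next():
            break
        go_next()
    return results


if __name__ == "__main__":
    emails = ["mailto:a@example.com", "", "mailto:b@example.com"]
    print(strip_mailto(emails))
```

The duplication described in post 3 is consistent with the replace being applied to the whole list on every pass of the loop; handling one item per pass, as strip_mailto does, avoids both that and the loss of blank rows.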