UBot Underground

smb1970

Fellow UBotter
  • Content Count

    20

Everything posted by smb1970

  1. Hi, What's http get? Best regards Steve
  2. Thanks, I don't have scrape box.
  3. Hi, I have a list of around 20,000 domain names and I'm looking to get email addresses for them, I would imagine by reading each website and looking for a contact page. It doesn't have to be perfect; a 20% hit rate would be fine. Anyone got anything? I'm prepared to swap for a working alpha state Yell scraper. Best regards Steve
  4. Hi Jeff, That's a big ask. There are people out there marketing Yellow Pages scrapers and while mine is not robust enough to release commercially in its current state, it is a solid and reliable bot that I've already used to obtain thousands of pages from Yell. There's really nothing to stop you taking what I've done, giving it a quick polish and releasing it yourself tomorrow. I'm happy to help out with advice if you have specific issues with your own development though. Feel free to PM me. Best regards Steve
  5. Hi, I seem to be having a problem with the keyboard event command not always working. I'm giving focus to the application in question but issuing the keyboard event command seems to do nothing. Pressing the required key manually is working fine. Anyone else ever experience this problem? Best regards Steve
  6. Thanks for the tip. I'll give it a go. Best regards Steve edit: I added in these lines. I also added in a clear cookies command at the end of each cycle, an extra wait for 2 seconds before each page scrape and switched it so the proxy change only happens at the end of each location scrape. Seems to work a lot better now, unfortunately it takes a little longer and I still need to bomb proof the HMA proxy switch but other than that I appear to have a sweet little Yell scraper now.
  7. Hi, I've been working on a Yellow Pages scraper the last few days. It pretty much works but the browser seems to lock up every now and then, so the click event on the Element to Click for the next page won't work for some reason. Stopping the script and trying to manually click on the web page doesn't work either, but if I switch to another web site manually in the internal browser everything is fine again. It's almost as though Yell can tell they are being scraped and are somehow stopping me. Is this possible or is it potentially an issue with my code? I'm using Hide My Ass Pro and switching t
  8. I managed to achieve this with default commands in Ubot standard. It's simply a case of using the windows commands. Here's how.
  9. Hi, I'd be interested in knowing how you can change the IP address of Hide My Ass from within ubot please? Best regards Steve
  10. This is a really useful video, thanks for sharing it. Aren't screen positions relative to the top of the program at the mercy of the individual user's settings? I think for some applications it might be better to use keystrokes rather than try and interpret where the mouse might need to be. Best regards Steve
  11. Hi, I have come across a problem with a Yellow Pages scraper that seems only to occur when I'm using Run but does not occur when I'm using Step. Here's what I have ( attached ). So, I'm running the loop until it can't see the "Next" link for the next page, then after scraping the page and populating my lists ( via custom commands ) I click the next link, wait for the page to load and go back to the top to scan the next page. I then repeat the same process after the loop to retrieve the last page ( not shown ). Main list ( the one I'm clearing ) is the list that contains the scrape for
  12. Thanks to everyone that provided assistance on this thread. I've subsequently managed to write my first Ubot - a Yellow Pages scraper which works nicely... until you get to page 10 and realise they are limiting the results returned. Looks like I'll have to learn how to read a file and narrow down the search areas in order to scrape more data. Best regards Steve
  13. Excellent, much appreciated. It would be useful if that was offered as an alternative to the video tutorials ( for example at the top of the video tutorial page ). Best regards Steve
  14. Thanks for the tips. Yes I can program and yes I did just jump straight in, I learn better that way, which I agree can be frustrating for the people I ask for support. My apologies if it's annoying. I have watched some of the videos but I don't consume video tuition particularly well, especially not when I would describe it as verbose. If there was a ubot book I would happily sit down and read through it in an hour or two. However watching a 20 minute video to glean information that can be conveyed in a couple of diagrams and a few lines of explanatory text isn't my idea of time well spent
  15. Okay, after much head scratching and reading of regex pages I finally managed to construct a string (?<=<div class="parentListing ui-draggable")(?s)(.*?)(?=</div> <div class="pusherDiv">) This gets my data out into a list of 15 elements. My next head scratch is how do I parse through each of these 15 strings in order, checking for the relevant fields and setting null values if a field is not present in that element so that at the end I have a nice file with 15 rows with all the data lined up. My main problem lies in the fact that I have no idea how to deal with each
  16. Thanks for the suggestion. It still doesn't help unless you are implying that I scrape the entire page of html code from the first occurrence of my string to the bottom of the page - as that's the only way given my current understanding of regular expressions. Best regards Steve
  17. Okay thanks for the responses folks. I've tried to attach the file and it won't let me put it on the original post and I can't see any way to attach a file to this response. So here's a section of it below. Basically what I am trying to do is to pull each record from the Yellow Pages using <div class="parentListing which seems to be the only way to guarantee you get all the data for a record. I'm then planning to slice and dice it in Ubot to get the relevant fields I need. This to me also seems to be the way to ensure I get the correct data for each record as a simple scrape puts all the d
  18. Hi, I'm trying to set up a regular expression string to basically scrape a chunk of HTML between two set points. However between these two points is an irregular number of lines. Now I've watched the Regex video and I can see a way to do it. However I'm looking for a foolproof way to do it. I tried (.*\n)* and then using a lookahead assertion but it didn't work. Without the lookahead assertion it did work but obviously pulled up everything after the initial string I specified. Can anyone give me any suggestions on how to do this please? Best regards Steve
  19. Hi, Just bought a copy of Ubot Studio and I'm looking to write a Yellow Pages scraper and a Google Places scraper initially. I had a play around with writing a Yellow Pages scraper and managed to get a program that creates a nice text file but there are a few issues and to resolve them I think I need to understand how the scraping works. First problem, the entries are not in the same order they are on the page - which I would expect as I'd anticipate that the scraper just goes down the page. Is this not how it works? Or is it a case that my lists in Ubot studio are being stored unordered?
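
For anyone following the regex questions in posts 15 and 18, here is a rough sketch of the same idea outside of Ubot, written in Python purely for illustration. It reuses the pattern from post 15 to pull each listing block out of the page, then checks every block for a set of fields and writes an empty value where one is missing, so the rows stay lined up. The sample HTML, the "name" and "phone" span classes and the function names are all made up for the example; only the two div markers come from the posts. One likely reason (.*\n)* with a lookahead fails is that it is greedy and runs on to the last match on the page, whereas the lazy (.*?) stops at the first closing marker. The (?s) flag is passed as re.DOTALL here because recent Python versions reject an inline global flag that is not at the start of the pattern.

```python
import re

# Hypothetical sample markup. Only the "parentListing ui-draggable" and
# "pusherDiv" markers come from the forum posts; the spans are invented.
SAMPLE_HTML = """
<div class="parentListing ui-draggable" id="a">
  <span class="name">Alpha Plumbing</span>
  <span class="phone">01234 567890</span>
</div> <div class="pusherDiv"></div>
<div class="parentListing ui-draggable" id="b">
  <span class="name">Beta Builders</span>
</div> <div class="pusherDiv"></div>
"""

# Lazy match between the two fixed markers. re.DOTALL lets "." cross
# newlines, so the number of lines inside a block does not matter.
BLOCK_RE = re.compile(
    r'(?<=<div class="parentListing ui-draggable")'
    r'(.*?)'
    r'(?=</div> <div class="pusherDiv">)',
    re.DOTALL,
)

# Hypothetical per-field patterns, one per output column.
FIELD_PATTERNS = {
    "name": re.compile(r'<span class="name">(.*?)</span>', re.DOTALL),
    "phone": re.compile(r'<span class="phone">(.*?)</span>', re.DOTALL),
}


def extract_blocks(html):
    """Return one string per listing, in page order."""
    return BLOCK_RE.findall(html)


def parse_block(block):
    """Look up every field in a single block, using "" when it is absent."""
    row = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(block)
        row[field] = match.group(1).strip() if match else ""
    return row


def scrape_rows(html):
    """Split the page into blocks first, then parse each block on its own."""
    return [parse_block(block) for block in extract_blocks(html)]


if __name__ == "__main__":
    for row in scrape_rows(SAMPLE_HTML):
        # One tab-separated line per listing; missing fields stay empty.
        print(row["name"], row["phone"], sep="\t")
```

Splitting into blocks first and only then looking for fields is what keeps each value tied to the right record, which may also explain the out-of-order results described in post 19 when every field is scraped into its own list.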