chris weber · Posted August 19, 2010

Hi everyone, I'm having a bit of a problem with my bot again while trying to use the $replace command. I have a list of URLs that I scraped, and for some reason a few of the lines contain "javascript:void(0)" instead of a website URL. What I want to do is load this file of URLs, use the $replace command to replace every line that says "javascript:void(0)" with $nothing, and then save the scrubbed URL list back out to a file. If anyone knows how I can accomplish this, please let me know. Thanks again, Chris
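For readers not using uBot, here is a rough Python sketch of the clean-up Chris describes: load the scraped list, drop every line that is just "javascript:void(0)", and keep the scrubbed URLs. The sample data is invented for illustration.

```python
# Invented sample of a scraped URL list with one bad line in it.
sample = """http://www.example.com/dentist1
javascript:void(0)
http://www.example.com/dentist2"""

# Keep only the lines that are real URLs, dropping the javascript junk.
clean_lines = [line for line in sample.splitlines()
               if line.strip() != "javascript:void(0)"]
cleaned = "\n".join(clean_lines)
```

In a real script you would read `sample` from the scraped file and write `cleaned` back out.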
MiriamMB · Posted August 19, 2010

Hey Chris, I can take a look at your script privately for you. Just send me a message.
chris weber · Posted August 19, 2010 (Author)

Hey Lily, here is my Google Maps scraper as it stands right now. You may need to change the file paths in it for it to work. The problem I'm having is in the "scraping links" script: I have a sub named "delete bad links" that I'm trying to get to load the file, search through it, delete the bad lines, and then save the scrubbed list of URLs back out to the file. When you run it, just enter a city like "los angeles" and a business type like "dentist"; it will search Google Maps and collect the links, but when it saves them to a file you will see at least one line that says "javascript:void(0)" rather than a URL. Those are the lines I want to remove from the list. Thanks again Lily, your help has been amazing :)

Attachment: google maps scraper.ubot
MiriamMB · Posted August 19, 2010

I never got any of the javascript junk when I tried it. It might be because I was logged into an account; perhaps you would get cleaner links if you had an account and were logged in.

About the "add to list" part with the $replace constant: there are no string placeholders (you know, the numbers in the squiggly brackets: {1}, {2}, {3}, etc.) referring to the appropriate constant. Why is that? Did you delete them?

In the "scraping links" script, I made some modifications to make things look less cluttered and to make your script work a bit better. At the end of the script, I modified the "add to list" and added two set commands above it. I set a variable called "file text" to the $read file constant, pointed at the file containing your scraped list of URLs. I then created another set command where I set a variable called "modified text" to the $replace constant: the original text is the variable #file text (remember, it holds the contents of the file with your scraped URLs), under Search I put the thing you are trying to replace (javascript:void(0)), and under Replace I put the $new line constant like you wanted. In your "add to list", you would then put the name of the list you are trying to create after cleaning, and the content will be the $list from text constant, with the variable #modified text under Text in the $list from text parameter window. The delimiter would be the $new line constant.

I'm attaching the modified bot; let me know how it goes.

Attachment: Modified.ubot
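Miriam's uBot pipeline ($read file, then $replace, then $list from text with a $new line delimiter) can be sketched in Python; this is only an illustration with made-up data, not the actual uBot commands.

```python
# $read file: the raw contents of the scraped URL file (invented sample).
file_text = "http://a.example\njavascript:void(0)\nhttp://b.example"

# $replace: substitute the bad token away (here with an empty string).
modified_text = file_text.replace("javascript:void(0)", "")

# $list from text with $new line as the delimiter.
url_list = modified_text.split("\n")
```

Note that removing only the token leaves an empty entry in the list, because the newlines around it survive.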
chris weber · Posted August 19, 2010 (Author)

Hi Lily, thanks for that great info. I have tried logging in to my Google account through the script and then scraping the links, but I am still having the javascript problems. I attached the txt file that has the lines of javascript code in it. If you know how I could search the text file and remove those lines, that would be great. Thanks again :)

Attachment: business links.txt
MiriamMB · Posted August 19, 2010

Ok, I understand better now; I thought you meant it was attached to the URLs. I am taking a look now.
MiriamMB · Posted August 19, 2010

Okay, this solution works. I have isolated it for you so you can run it separately and then implement it into your script later. Instead of replacing the void with a $new line, I replaced it with the $nothing constant. You can change that as you wish. Let me know how it goes.

Attachment: Cleaning the javascript.ubot
chris weber · Posted August 20, 2010 (Author)

THANK YOU!!! That works great for what I need to do. Now my question: after I save those URLs, I have another script that loads them and navigates to each one. However, when the javascript gets removed from the list it leaves a blank line in its place, so the script stops and errors when it reaches the blank line, because it doesn't know how to navigate to a blank site. Is there any way to remove that line entirely and just keep a continuous flow of URLs? I also attached my scrubbed list of URLs. Thanks again for your amazing help :)

Attachment: business links.txt
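The blank lines appear because replacing "javascript:void(0)" with $nothing removes the token but not the newlines around it. A minimal Python sketch (invented data) of dropping the empty lines entirely so the URL list stays contiguous:

```python
# A scrubbed list where the removed token left blank lines behind.
scrubbed = "http://a.example\n\nhttp://b.example\n"

# Keep only non-empty lines; strip() also discards whitespace-only lines.
urls = [line for line in scrubbed.splitlines() if line.strip()]
```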
chris weber · Posted August 20, 2010 (Author)

Never mind, I got it to work. I just have it replacing the javascript error lines with a dummy URL that goes to google.com, so that it doesn't error out on those lines. Thanks again for all the help Lily :)
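Chris's workaround, sketched in Python with invented data: instead of deleting the bad lines, substitute a harmless placeholder URL so the navigation script never encounters a blank line.

```python
# Raw scraped list with one javascript line in it (invented sample).
text = "http://a.example\njavascript:void(0)\nhttp://b.example"

# Swap the bad token for a dummy URL instead of removing it.
patched = text.replace("javascript:void(0)", "http://www.google.com")
```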
MiriamMB · Posted August 20, 2010

Great job! I don't know why that didn't come to mind! lol *sigh*
Wisdom4U · Posted September 28, 2010

Hi Chris, wondering if you completed this project. Is it working? I am trying to do something similar and would appreciate your input. Here is one of my questions: why did you decide to use maps.google.com instead of simply using google.com? And do you feel it was the right choice? I am a new user and new to the forum, so I'm not sure if there are restrictions on PM or ... Looking forward to your response. Steve
tbc · Posted October 6, 2011

Is there a Google Maps scraper for UBot 4? These files only work for UBot 3. Or is there a conversion program for old ubot files?