UBot Underground

problems using $replace


Recommended Posts

Hi everyone,

 

I am having a bit of a problem with my bot again while trying to use the $replace command. I have a list of URLs that I scraped, but for some reason a few of the lines contain "javascript:void(0)" instead of a website URL. What I want to do is load this file of URLs, use the $replace command to replace every line that says "javascript:void(0)" with $nothing, and then save the scrubbed URL list back out to a file. If anyone knows how I can accomplish this, please let me know. Thanks again,
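(For anyone following along outside UBot, here is the same load-replace-save idea sketched in plain Python; the file path is hypothetical and the filter simply drops any line that is exactly "javascript:void(0)".)

```python
def clean_urls(text):
    # Drop any line that is exactly "javascript:void(0)"; keep the real URLs.
    kept = [line for line in text.splitlines()
            if line.strip() != "javascript:void(0)"]
    return "\n".join(kept)

# Round-trip through a file (path is illustrative):
# with open("business links.txt") as f:
#     scrubbed = clean_urls(f.read())
# with open("business links.txt", "w") as f:
#     f.write(scrubbed)
```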

 

Chris


Hey Lily,

 

Here is my Google Maps scraper as it stands right now. You may need to change the file paths in it for it to work. The problem I'm having is in the "scraping links" script: I have a sub named "delete bad links" that I'm trying to get to load the file, search through it, delete the bad lines, and then save the scrubbed list of URLs back out to the file.

 

When you run it, just enter a city like "los angeles" and a business type like "dentist". It will search Google Maps and get the links, but when it saves them to a file you will see that the file has at least one line that says "javascript:void(0)" rather than a URL. Those are the lines I want to remove from the list.

 

Thanks again Lily, your help has been amazing:)

 

google maps scraper.ubot


I never got any of the javascript junk when I tried it. It might be because I was logged into an account. Perhaps you would get cleaner links if you had an account and you were logged in.

 

The add to list part with the replace constant:

 

There are no string variables (you know, the numbers in the curly braces? {1}{2}{3}, etc.) referring to the appropriate constant. Why is that? Did you delete them?

 

 

In the "Scraping Links" script, I made some modifications to make things look less cluttered and to make your script work a bit better.

 

At the end of the script, I made some modifications to the "add to list" command and added two set commands above it.

First, I set a variable called "file text" to the $read file constant, with the file containing your scraped list of URLs.

 

I then created another set command, setting a variable called "modified text" to the $replace constant: the original text is the variable "file text" (remember, it holds the $read file contents of your scraped URLs); under Search, I put the thing you are trying to replace (javascript:void(0)); and under Replace, I put the $new line constant like you wanted.

 

In your "add to list", you would then put the name of the list you are trying to create after cleaning, and the content will be the $list from text constant with the variable #modified text under Text in the $list from text parameter window.

 

The delimiter would be the $new line constant.
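(The read file → replace → list-from-text pipeline above maps onto plain Python like this; the file contents are illustrative, and the variable names mirror the UBot ones. Note that each replaced spot becomes an empty entry in the resulting list.)

```python
# Stand-in for $read file on the scraped-links file (contents are illustrative)
file_text = "http://a.example/\njavascript:void(0)\nhttp://b.example/"

# $replace: swap the javascript placeholder for a $new line
modified_text = file_text.replace("javascript:void(0)", "\n")

# $list from text with a $new line delimiter
url_list = modified_text.split("\n")
```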

 

I'm attaching the modified bot. Let me know how it goes.

 

Modified.ubot

 

 

 

 


Hi Lily,

 

Thanks for that great info. I tried logging in to my Google account through the script and then scraping the links, but I am still having the javascript problem. I attached the txt file that has the line of javascript code in it. If you know how I could search the text file and remove those lines, that would be great. Thanks again:)

 

business links.txt



Ok, I understand better now. I thought you meant it was attached to the URLs. I am taking a look now.



Okay, this solution works. I have isolated it for you so you can run it separately and then implement it into your script later. Instead of replacing the void with a $new line, I replaced it with a $nothing constant. You can change that as you wish.

 

Let me know how it goes.

 

Cleaning the javascript.ubot


THANK YOU!!! That works great for what I need to do. Now my question: after I save those URLs, I have another script that loads them and navigates to each one. When the javascript gets removed from the list, it leaves a blank line in its place, so the script stops and errors when it reaches the blank line because it doesn't know how to navigate to a blank site. Is there any way to remove that line entirely and just keep a continuous flow of URLs? I also attached my scrubbed list of URLs. Thanks again for your amazing help:)

 

business links.txt
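(In plain Python, dropping the leftover blank lines entirely, rather than replacing them, is a one-line filter; the sample contents below are illustrative.)

```python
# Scrubbed file text where the removed javascript left blank lines behind
scrubbed = "http://a.example/\n\nhttp://b.example/\n"

# Keep only non-empty lines, so the list of URLs flows with no gaps
urls = [line for line in scrubbed.splitlines() if line.strip()]
```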


Nevermind. I got it to work. I just have it replacing the javascript error lines with a dummy url that just goes to google.com so that it doesn't error out on those lines. Thanks again for all the help Lily:)

 

great job! I don't know why that didn't come to mind! lol *sigh*

  • 1 month later...


Hi Chris,

Wondering if you completed this project. Is it working? I am trying to do something similar and would appreciate your input. Here is one of my questions: why did you decide to use maps.google.com instead of simply using google.com? And do you feel it was the right choice?

I am a new user and new to the forum, not sure if there are restrictions on PM or ...

Look forward to your response.

 

Steve

Link to post
Share on other sites