UBot Underground

How to replace



I would greatly appreciate some guidance in getting this done.

 

I am scraping some URLs from a website and saving them to a file. However, the URL string has a number of parameters, separated by the "&" delimiter, that I want to retain as well.

 

When I examine my text file, it shows the full URL, except that each "&" was changed to "&amp;".

 

Is there a way to automatically change "&amp;" back to "&"?

 

Thanks!

ubot-006.JPG


I just noticed while setting up a page scrape that the "&" seems to be changing within uBot itself. No other characters are affected; only the "&" is changed to "&amp;".

 

Any ideas?


I've encountered this in the past too. Try scraping the attribute differently. When you right-click the thing on the page you want to scrape, a box comes up with the HTML code at the top; sometimes moving to the next level in that HTML code can do it.

 

An alternative is, after you scrape, to create a temporary list/variable. If it's a list, loop through it: select each element, do a replace on it, and save the result to the temporary list. Then clear the original list and copy the temporary list back into it.
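
The loop described above can be sketched in plain JavaScript. This is only an illustration of the idea, not uBot script: the list name `scrapedUrls`, the `example.com` URLs, and the assumption that the unwanted text is the HTML entity `&amp;` are all made up for the example.

```javascript
// Hypothetical scraped list; in a real bot this would come from the scrape step.
const scrapedUrls = [
  "http://example.com/page?id=1&amp;sort=asc",
  "http://example.com/page?id=2&amp;sort=desc&amp;page=3",
];

// Step 1: build a temporary list holding a cleaned copy of each element.
const tempList = [];
for (const url of scrapedUrls) {
  // split/join replaces every occurrence of "&amp;" with "&".
  tempList.push(url.split("&amp;").join("&"));
}

// Step 2: clear the original list, then copy the temporary list back into it.
scrapedUrls.length = 0;
scrapedUrls.push(...tempList);

console.log(scrapedUrls);
```

In uBot the same steps would be built from list and loop nodes; the point is only that each element is cleaned individually before the list is written out.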

 

Done!

 

Frank


Could you use regex to solve this problem? If I am understanding you, you could take the scraped URL and do a bit of replacing in the $eval command.

 

After adding your scraped URL, you would ultimately have this in your $eval node: "{1}".replace(/\&amp;/g,"&");

 

The g flag after the second forward slash isn't needed if you only need to replace the first occurrence.
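
As a rough illustration of what the `g` flag changes, here is the same kind of replacement in standalone JavaScript. The sample URL is invented, and the sketch assumes the goal is to turn `&amp;` back into `&`.

```javascript
// Hypothetical scraped value containing two encoded ampersands.
const scraped = "http://example.com/search?q=ubot&amp;page=2&amp;lang=en";

// With the g (global) flag, every "&amp;" is replaced.
const allFixed = scraped.replace(/&amp;/g, "&");

// Without the flag, only the first match is replaced.
const firstOnly = scraped.replace(/&amp;/, "&");

console.log(allFixed);  // http://example.com/search?q=ubot&page=2&lang=en
console.log(firstOnly); // http://example.com/search?q=ubot&page=2&amp;lang=en
```

Since a URL with several parameters will usually contain more than one ampersand, keeping the `g` flag is probably the safer default here.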

 

I'm really sorry if you already knew this and I was just misunderstanding what you wanted.


@Frank

Yeah, I tried that, but it didn't work. The URL is still changed. No matter how far back I back up the screen grab, it still changes the "&" to "&amp;".

 

I also tried splitting it out but no success.

 

@alcr

I would be interested to see that example. I thought I had seen all of the examples, but I guess I could have missed it.

 

@theskinnys

Hmmmmm....interesting. I may play with this today. I would love to have a quick solution. Thanks for the nudge.

 

Thanks again everyone for replying!

 

Buddy


Are there any examples in here as to the proper way to use regex?

 

Not sure. I learned the hard way. I went through the script library posted by someone here and basically just went through online guides to learn how it works. The regex is all JavaScript, so if you want to learn it, it'll have to be through JavaScript regex.

 

I had already done the regex in a uBot script. I added some comments just now, so I might as well post the script. The comments are kind of verbose and assume the viewer doesn't know any JavaScript/programming, but in case someone else needs the same thing it seemed best to over-comment.

replace.ubot


I'm still stuck here. Doing the fancy coding seems to work, BUT it would make better sense to be able to add the href contents to a list. When I Choose by Attribute and select href, the link and all parameters appear PERFECTLY. However, when I select via the Add to List path, the URL is altered. I believe this is being done in the rendering portion of the control being implemented. After all, why isn't the href altering the URL when choosing by attribute?
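
One possible explanation, offered as an assumption rather than a statement about uBot's internals: in valid HTML source an ampersand inside an attribute is written as `&amp;`, while the attribute value a browser exposes after parsing has it decoded back to `&`. A scrape that reads the raw markup would then see the entity, and one that reads the parsed attribute would not. The sketch below imitates that difference with an invented snippet of page source.

```javascript
// Hypothetical page source; valid HTML writes "&" inside an attribute as "&amp;".
const pageSource = '<a href="http://example.com/item?id=7&amp;ref=home">Item</a>';

// Reading the raw markup keeps the entity exactly as written in the source.
const rawHref = pageSource.match(/href="([^"]*)"/)[1];

// A parsed attribute value is what remains once the entity is decoded.
// Only "&amp;" is handled here; a real HTML parser decodes every entity.
const decodedHref = rawHref.replace(/&amp;/g, "&");

console.log(rawHref);     // http://example.com/item?id=7&amp;ref=home
console.log(decodedHref); // http://example.com/item?id=7&ref=home
```

If that is what is happening, the scraped text is faithful to the page source, and the replace step is simply doing the decoding a browser would normally do.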

 

If I could Add to List using an href option then THAT my friends would solve my problem perfectly.

 

I hope I am not picking at a sore spot here. I just think that if I am scraping URLs, they should be left intact. If this was never considered, then maybe the powers in charge could incorporate it into a newer upgrade.

 

Thanks for listening, and if you haven't heard, uBot is awesome!

 

Buddy


Okay, I think I have found a solution that I can live with for a while.

 

What I realized is that I needed to do the replacement on each URL string individually, rather than attempting to do it in the write-to-file piece of my list generation.

 

So the point at which I load my list items is where I do the search & replace.

 

It's two extra steps but it does solve the problem.

 

I still think that the URL string should be left intact at the point of scraping. Maybe that can be incorporated into a future version.

 

Thanks everyone for the comments.

 

Buddy

ubot-009.JPG



I think that is as good a work-around as any. I like your choice of variable names in the example :-)

 

Andy
