UBot Underground


Hi,

 

So, I've figured out how to strip a bunch of characters and junk I don't want from my output using $replace regular expression. The issue is that I'm replacing a lot of junk with a ",". When I replaced it with nothing instead, it mushed all the data together with no commas at all, and that defeated my add to list. Now my data looks like this (sticking with the replace-junk-with-a-comma idea):

 

,,results,,,,,query,,,baby,girl,clothes,,,search_types,,,,h

 

Every time I run the bot it will have different data but the commas will always be the same.

 

So I am attempting to accomplish two things:

 

1) Come up with a way to trim the extra commas out leaving only one comma after each word

2) Every time I run the bot it gives me the words "results, query, and search_types" over and over and I wish to remove them completely and leave all the other data.

 

So for the second item, I'm wondering whether there is a generic Regex pattern that basically says, "Hey, when you see this word, exclude it and move on. Do that every time this word shows up in the results, since it repeats."

 

I'm thinking this has to do with the Regex look-ahead command but at that point, I'm a noob to Regex and am stuck. I've used Rubular.com and Regexone.com and I still can't figure it out. :(
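
Both goals can be handled without look-ahead: an alternation of the unwanted words removes them, and a second pass collapses the comma runs. Below is a minimal Python sketch of that idea rather than UBot code. The function name `clean_csv` is hypothetical, and the sample string is the one quoted above. The two patterns are ordinary regex, so they should carry over to $replace regular expression, though that is an assumption and was not tested in UBot.

```python
import re

def clean_csv(raw, unwanted):
    """Remove whole unwanted words, then leave a single comma between the rest."""
    # \b keeps "query" from matching inside a longer token such as "search_query".
    word_pattern = r"\b(?:" + "|".join(re.escape(w) for w in unwanted) + r")\b"
    without_words = re.sub(word_pattern, "", raw)
    # Collapse every run of two or more commas into one comma.
    collapsed = re.sub(r",{2,}", ",", without_words)
    # Drop the comma left at either end.
    return collapsed.strip(",")

sample = ",,results,,,,,query,,,baby,girl,clothes,,,search_types,,,,h"
cleaned = clean_csv(sample, ["results", "query", "search_types"])
print(cleaned)  # baby,girl,clothes,h
```

Removing the words first and collapsing commas second matters: doing it the other way round would leave fresh comma runs where the words used to be.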

 

Any help here would be greatly appreciated.

 

Thank you!


I think this is what you want..

set(#data, "This is example text I want to search and replace real cool like a champ.", "Global")
set(#replace words, "text|search|cool", "Global")
alert($replace regular expression(#data, #replace words, $new line))
comment("a function example")
define $Replace with New Line(#input text, #New delimiter, #Items to replace) {
    return($replace regular expression(#input text, #Items to replace, #New delimiter))
}
}
alert($Replace with New Line(#data, $new line, #replace words))
alert($Replace with New Line(#data, $new line, $text from list($list from text("text
search
cool", $new line), "|")))
comment("you can incorporate the \"items to replace\"
into the function so it can just take the list.
so all you would need to supply is a list")

If not, it should at least point you in the right direction.
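
For readers who want to try the same idea outside UBot, here is a rough Python sketch of the function described in the last comment: it takes a list of words and builds the `text|search|cool` alternation itself. The name `replace_with_delimiter` is hypothetical and Python is only used for illustration. The sample sentence and word list come from the code above.

```python
import re

def replace_with_delimiter(input_text, delimiter, items):
    """Replace every occurrence of any item with the delimiter."""
    # Escape each item so characters like "." or "|" are treated literally.
    pattern = "|".join(re.escape(item) for item in items)
    # A function replacement avoids any backslash processing of the delimiter.
    return re.sub(pattern, lambda _match: delimiter, input_text)

data = "This is example text I want to search and replace real cool like a champ."
result = replace_with_delimiter(data, "\n", ["text", "search", "cool"])
print(result)
```

The caller only supplies the list. The joined pattern stays an implementation detail of the function.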

 

CD


Hi CD,

 

Thank you! I'm working with the code you provided above and getting very close. Your reply is very much appreciated and has me going in the right direction.

 

Thank you again!


Thanks!

 

So when I use the GET request the data comes in like this:

 

{"results":[{"query":"baby girl clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","in Vintage<\/b>"]},{"query":"baby clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","inVintage<\/b>"]},{"query":"baby boy clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","inVintage<\/b>"]},{"query":"american girl doll clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]},{"query":"clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]},{"query":"dog clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]},{"query":"hippie clothes","search_types":["handmade"],"search_type_names":["inHandmade<\/b>"]},{"query":"doll clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","inVintage<\/b>"]},{"query":"barbie clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","inVintage<\/b>"]},{"query":"kids clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","inVintage<\/b>"]},{"query":"workout clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]},{"query":"find shop names containing <\/span>clothes<\/span>","link":"\/search\/shops?search_query=clothes","search_types":[],"search_type_names":[]}],"count":12,"experiment":"off"}

 

So that's when I used Regex with "\W" and replaced the misc characters with a ",". That gets me 95% of the way :). It's pretty cool, Regex is cool!

 

Then I'm left with too many commas and some non-keyword words I want to remove (e.g., results, query, category_tags_vintage, and the standalone letter "b"):

 

,,results,,,,,query,,,baby,girl,clothes,,,search_types,,,,handmade,,,category_tags_vintage,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,in,,b,Vintage,,,b,,,,,,,query,,,baby,clothes,,,search_types,,,,handmade,,,category_tags_vintage,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,in,,b,Vintage,,,b,,,,,,,query,,,baby,boy,clothes,,,search_types,,,,handmade,,,category_tags_vintage,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,in,,b,Vintage,,,b,,,,,,,query,,,american,girl,doll,clothes,,,search_types,,,,handmade,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,,,,query,,,clothes,,,search_types,,,,handmade,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,,,,query,,,dog,clothes,,,search_types,,,,handmade,,,,search_type_names,,,,in,,b,Handmade,,,b,,,,,,,query,,,hippie,clothes,,,search_  etc...

 

So that's where I left off.  The keywords will always change but the "junk" will be constant from the GET so I have to figure out the best possible filter. Maybe my initial Regex "\W" could have been more elaborate? That's where my lack of knowledge comes in.
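
To see where the comma runs come from, here is a small Python sketch (Python is used for illustration only, and the variable names are made up) that applies "\W" to the opening of the response quoted above. Replacing each non-word character one at a time produces one comma per character. Matching a run with "\W+" produces one comma per run, which is one way the initial pattern could be made "more elaborate".

```python
import re

# Opening characters of the GET response quoted above.
snippet = '{"results":[{"query":"baby girl clothes","search_types":["h'

# One comma for every single non-word character.
per_character = re.sub(r"\W", ",", snippet)
print(per_character)

# One comma for every run of non-word characters, with the ends trimmed.
per_run = re.sub(r"\W+", ",", snippet).strip(",")
print(per_run)
```

Either way, the spaces inside "baby girl clothes" are non-word characters too, so the phrase is split into three separate items. That is a limitation of the "\W" approach in general.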

 

Thanks!


I think this is what you want...

 

 

set(#json, "\{\"results\":[\{\"query\":\"baby girl clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"in Vintage<\\/b>\"]\},\{\"query\":\"baby clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"inVintage<\\/b>\"]\},\{\"query\":\"baby boy clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"inVintage<\\/b>\"]\},\{\"query\":\"american girl doll clothes\",\"search_types\":[\"handmade\"],\"search_type_names\":[\"in Handmade<\\/b>\"]\},\{\"query\":\"clothes\",\"search_types\":[\"handmade\"],\"search_type_names\":[\"in Handmade<\\/b>\"]\},\{\"query\":\"dog clothes\",\"search_types\":[\"handmade\"],\"search_type_names\":[\"in Handmade<\\/b>\"]\},\{\"query\":\"hippie clothes\",\"search_types\":[\"handmade\"],\"search_type_names\":[\"inHandmade<\\/b>\"]\},\{\"query\":\"doll clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"inVintage<\\/b>\"]\},\{\"query\":\"barbie clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"inVintage<\\/b>\"]\},\{\"query\":\"kids clothes\",\"search_types\":[\"handmade\",\"category_tags_vintage\"],\"search_type_names\":[\"in Handmade<\\/b>\",\"inVintage<\\/b>\"]\},\{\"query\":\"workout clothes\",\"search_types\":[\"handmade\"],\"search_type_names\":[\"in Handmade<\\/b>\"]\},\{\"query\":\"find shop names containing <\\/span>clothes<\\/span>\",\"link\":\"\\/search\\/shops?search_query=clothes\",\"search_types\":[],\"search_type_names\":[]\}],\"count\":12,\"experiment\":\"off\"\}", "Global")
set(#find, $find regular expression(#json, "(?<=query\":\").*?(?=\",\")"), "Global")
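
The same lookbehind/lookahead pattern can be tried outside UBot. The sketch below is Python for illustration only. The variable names are hypothetical, and `raw` is a trimmed copy of the response (two entries taken from the data above), not the full payload.

```python
import re

# Two entries copied from the response above; the rest is trimmed for brevity.
raw = r'{"results":[{"query":"baby girl clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","in Vintage<\/b>"]},{"query":"dog clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]}],"count":2,"experiment":"off"}'

# (?<=query":")  start right after  query":"
# .*?            take as little text as possible
# (?=",")        stop right before  ","
queries = re.findall(r'(?<=query":").*?(?=",")', raw)
print(queries)  # ['baby girl clothes', 'dog clothes']
```

Because the lookarounds are not part of the match, each result is the bare keyword phrase with its spaces intact.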

 

 

This is a JSON response from your GET, so ideally you want a JSON parser.

 

There is a JSON plugin that is free to use, but it is not reliable.

 

Alternatively, you can use UBot 5's Python support, or use the regex above.

 

Not sure if you want the last line, but I think you can handle that.
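
As a rough illustration of the JSON-parser route, here is a minimal sketch using Python's standard `json` module. How this would be wired into UBot 5 is not shown and is left as an assumption. `raw` is a trimmed copy of the response above (three entries), and skipping entries that carry a "link" key is a guess at how to drop the last "find shop names" line, based only on this one sample.

```python
import json

# Trimmed copy of the response above: two keyword entries plus the final shop-search entry.
raw = r'{"results":[{"query":"baby girl clothes","search_types":["handmade","category_tags_vintage"],"search_type_names":["in Handmade<\/b>","in Vintage<\/b>"]},{"query":"dog clothes","search_types":["handmade"],"search_type_names":["in Handmade<\/b>"]},{"query":"find shop names containing <\/span>clothes<\/span>","link":"\/search\/shops?search_query=clothes","search_types":[],"search_type_names":[]}],"count":12,"experiment":"off"}'

data = json.loads(raw)

# Keep the "query" value of every result that is not the shop-search link entry.
keywords = [entry["query"] for entry in data["results"] if "link" not in entry]
print(keywords)  # ['baby girl clothes', 'dog clothes']
```

A parser reads the structure instead of guessing at punctuation, so it keeps working if the order of the fields changes.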

 

CD


Okay,

 

So almost there. Sorry to keep being a pest but anyone struggling with Regex is going to learn a lot from this thread.  ^_^

 

So here's my code:

 

plugin command("SocketCommands.dll", "socket container") {
    plugin command("HTTP post.dll", "http max redirects", 15)
    set(#get, $plugin function("HTTP post.dll", "$http get", "https://etsy.com/suggestions_ajax.php?extras=\{\"autosuggest_expt\":\"off\",\"autosuggest_lang\":\"en-US\"\}&version=10_12672349415_2&search_query=clothes&search_type=all", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0", "etsy.com", "", ""), "Global")
}
set(#strip, $find regular expression(#get, "(?<=query\\\":\\\").*?(?=\\\",\\\")"), "Global")
load html(#strip)

 

And I used your Regex (which does the job 99.9%)

 

The output is this:

 

baby girl clothes

baby clothes

baby boy clothes

etc...and then

<\/span>clothes<\/span>

 

The search terms will always be different, so is there a way in the Regex to take out that last <\/span> and <\/span>?

 

Thanks Code Docta!


tada!

 

set(#strip, $replace regular expression($find regular expression(#get, "(?<=query\\\":\\\").*?(?=\\\",\\\")"), "<\\\\/span>.*<\\\\/span>", $nothing), "Global")
alert($replace regular expression("<\\/span>clothes<\\/span>", "<\\\\/span>.*<\\\\/span>", $nothing))
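
A quick way to check what this pattern does is to run it in isolation. The Python sketch below (illustration only, with hypothetical variable names) shows that `<\\/span>.*<\\/span>` removes the tags and everything between them, while a pattern for the tag alone keeps the word. Which of the two is wanted depends on whether "clothes" should survive on that last line.

```python
import re

# The leftover last line, with a literal backslash before each slash.
last_line = r"<\/span>clothes<\/span>"

# Pattern from the code above: both tags plus everything between them.
removed_entirely = re.sub(r"<\\/span>.*<\\/span>", "", last_line)

# Variant: strip only the tags and keep the text between them.
tags_only_removed = re.sub(r"<\\/span>", "", last_line)

print(repr(removed_entirely))   # ''
print(repr(tags_only_removed))  # 'clothes'
```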

 

Here is what I use to test in:

 

http://regexhero.net/tester/

 

My regex bible:

 

(?=ABC)      - Positive lookahead. Matches a group after your main expression without including it in the result.
(?!ABC)      - Negative lookahead. Specifies a group that cannot match after your main expression (i.e. if it matches, the result is discarded).
(?<=ABC)     - Positive lookbehind. Matches a group before your main expression without including it in the result.
(?<!ABC)     - Negative lookbehind. Specifies a group that cannot match before your main expression (i.e. if it matches, the result is discarded).
(?<=ABC).*?(?=ABC) - Extracts the text between specified groups.
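
Each entry in that list can be tried on a small string. The following is a minimal Python sketch with made-up sample text. Python is used only for illustration, and its lookbehind must be fixed-width, a restriction that may not apply to the regex engine UBot uses.

```python
import re

text = "price: 42, cost: 17"

# Positive lookahead: words that are followed by a colon.
followed_by_colon = re.findall(r"\w+(?=:)", text)

# Negative lookahead: whole words that are NOT followed by a colon.
not_followed_by_colon = re.findall(r"\b\w+\b(?!:)", text)

# Positive lookbehind: digits that come right after ": ".
after_colon = re.findall(r"(?<=: )\d+", text)

# Negative lookbehind: whole words that do NOT come right after ": ".
not_after_colon = re.findall(r"(?<!: )\b\w+", text)

# Lookbehind + lazy match + lookahead: the text between two markers.
between_brackets = re.findall(r"(?<=\[).*?(?=\])", "[a] and [b c]")

print(followed_by_colon)      # ['price', 'cost']
print(not_followed_by_colon)  # ['42', '17']
print(after_colon)            # ['42', '17']
print(not_after_colon)        # ['price', 'cost']
print(between_brackets)       # ['a', 'b c']
```

In every case the lookaround text is checked but never included in the result.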

 

I found this on the forum long ago; I can't remember who posted it.

 

Helloinsomnia has some great regex stuff too.

 

CD
