UBot Underground

scrape data with variable match



Hi,

I am testing UBot and have run into a problem.

I created a variable, #abc, and set its value to "http://yoursite.com/abc".

If #abc is set to http://yoursite.com/abc, the bot should scrape 3/470469,

and if #abc is set to https://www.abcdef.com, the bot should scrape 1/2017354,

and so on.

Is this possible?


Something like this?

if($comparison(#abc, "=", "http://yoursite.com/abc")) {
    then {
        set(#scraped_value, $find regular expression($document text, "(?<=//pause/).*?(?=\\\\)"), "Global")
    }
    else {
        if($comparison(#abc, "=", "https://www.abcdef.com")) {
            then {
                set(#scraped_value, $plugin function("File Management.dll", "$Find Regex First", $document text, "(?<=my_sites/pause/).*?(?=\\\\)"), "Global")
            }
        }
    }
}

I used the $Find Regex First function from Aymen's File Management Plugin for the second branch because there are two very similar tables in your HTML, and taking only the first match is the easier way to tell them apart. Let me know if this helps.

 

Marton
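
For anyone who wants to try the branching logic outside UBot, here is a minimal Python sketch of the same idea: pick a regex based on the value of the variable, then return the first match. It is not UBot code. The sample text, the `PATTERNS` table and the `scrape_for` helper are made up for illustration, since the actual HTML from this thread isn't shown; the patterns only mimic the lookbehind/lookahead style used above.

```python
import re

# Hypothetical page text: each ID is followed by a literal backslash,
# mimicking the escaped markup the regexes in the post look for.
SAMPLE_TEXT = r'''
<a href=\"//pause/3/470469\">pause</a>
<a href=\"my_sites/pause/1/2017354\">pause</a>
'''

# One pattern per known URL (assumed mapping, mirrors the if/else branches).
PATTERNS = {
    "http://yoursite.com/abc": r"(?<=//pause/).*?(?=\\)",
    "https://www.abcdef.com": r"(?<=my_sites/pause/).*?(?=\\)",
}


def scrape_for(url, text):
    """Return the first match of the pattern registered for url, or None."""
    pattern = PATTERNS.get(url)
    if pattern is None:
        return None  # unknown URL: same as falling through every branch
    match = re.search(pattern, text)
    return match.group(0) if match else None
```

A lookup table like this only scales as far as the list of known URLs does, so it does not help once the value can be anything a user types in.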


Thanks, urban.marton, for the quick help.

But what if there is a lot of data and the #abc variable is filled in by users? That means #abc can be anything.

Never mind, I just got what you meant. I'll take a look at it for you later today.


That's what I could come up with:

set(#scraped_value, $trim($find regular expression($find regular expression($document text, "(?s)(?<=my_sites/delete/).*?(?=\\\\.*{#url})"), ".*\\Z")), "Global")

Works for all three URLs in your html. I also attached the code I was playing with (no plugins needed).

test.ubot
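
One thing to watch when a user-supplied value is dropped into a regex: if it is inserted as-is, characters such as `.` and `?` in the URL act as regex metacharacters, so a near-miss URL can match too. Below is a minimal Python sketch of the escaping step. It is not the attached UBot code, and it uses a simpler line-by-line match rather than the nested regex above. The row layout (the ID right after `my_sites/delete/`, with the site URL later on the same line) and the `find_id_for_url` helper are assumptions for illustration only.

```python
import re

# Hypothetical rows: the ID comes right after "my_sites/delete/" and the
# site URL appears later on the same line. This layout is an assumption.
SAMPLE_TEXT = """\
<a href="my_sites/delete/3/470469">x</a> http://yoursite.com/abc
<a href="my_sites/delete/7/55">x</a> https://wwwXabcdef.com
<a href="my_sites/delete/1/2017354">x</a> https://www.abcdef.com
"""


def find_id_for_url(url, text):
    """Return the ID on the first line that also contains url, or None."""
    # re.escape makes '.', '?', '+' etc. in the URL match literally.
    # [^\n]*? keeps the match on a single line, so an ID is never paired
    # with a URL from a different row.
    pattern = re.compile(r"my_sites/delete/([\d/]+)[^\n]*?" + re.escape(url))
    match = pattern.search(text)
    return match.group(1) if match else None
```

Without the escaping, the `.` in `www.abcdef.com` would also match the `X` in the second sample row, and the wrong ID would come back.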

Great, it's working fine, although the results get messed up with more data.

But now I know how it works.

Thanks for the help :-)

Hi, I'm glad! Are you using UBot 4, though? I got clean results with it (although I don't know why $trim was necessary; it shouldn't be needed, so I guess it's a UBot quirk).

Let me know if you need any changes.

I am using UBot 5.

UBot works like a charm once you understand the logic :-)

I've been testing the code in UBot 4, which is why I asked. It gave me clean results there; UBot 5 still does some strange things, so I wouldn't go near it yet.


Hey-hey, I just had a similar problem today and realized there is a much simpler solution. I've attached the updated code for you. It works with full URLs and even with partial URLs, as you can see in this short video:

 

http://www.screencast.com/t/MIfA3gg6J3

 

Hope it helps,

Marton

test_updated.ubot
