UBot Underground

Html Parser Auto Fill (Script)


Hey

 

Since I learned how to use this awesome thing, I find myself using it more and more. So I've shortened the parameters a bit: you can now just paste the HTML string into the function, and it should build the parser call for you automatically. Using, for example, Firebug or UBot's element selector, just paste in the HTML snippet you would otherwise have used to fill in the parameters of Ayman's HTML Parser.

 

I've also noticed that while you're building a bot you use $document text, and once it's ready you have to go messing around with it to change it to an HTTP GET etc. So $document text is used by default; you can change that by filling in the document parameter with a variable or function.

 

It also uses regex to work out which attribute you want to scrape; if none is found, it uses exactly what you typed.

 

There may be some bugs, since I just wrote it, but it looks like it's working and useful.

Also, on the rare occasion that an attribute name would conflict with the regex finder, write your attribute surrounded by double quotes.
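Here's a rough sketch in plain JavaScript of how that attribute lookup behaves. It isn't the UBot code itself: `resolveAttribute` is a made-up helper name, and a simple case-insensitive substring test stands in for the script's regex match, so treat it as an illustration only.

```javascript
// The attribute names the script checks, in the same order.
const KNOWN_ATTRIBUTES = [
  "InnerText", "InnerHtml", "OuterHtml", "value", "id", "name",
  "src", "href", "type", "action", "title",
];

// Hypothetical helper (not part of the script): resolve what the user typed
// to one of the known attribute names.
function resolveAttribute(input) {
  // Input wrapped in double quotes skips the lookup and is kept as typed,
  // which is what the script does as well.
  if (/^".+?"$/.test(input)) {
    return input;
  }
  // Assumption: a substring test replaces the script's "(?i)" regex match.
  const needle = input.toLowerCase();
  const match = KNOWN_ATTRIBUTES.find((name) =>
    name.toLowerCase().includes(needle)
  );
  // The first match wins; with no match the input is used exactly as typed.
  return match !== undefined ? match : input;
}

console.log(resolveAttribute("innerh"));
console.log(resolveAttribute("data-id"));
```

Because the first match wins, something short like "inner" resolves to InnerText rather than InnerHtml, so type enough letters to be unambiguous.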

 

Some sites have elements with several attributes, and you may want a specific one (this script only looks at the first attribute).

Take the UBot forum: there's a class shared by the subsection lists, so pasting the code below would return all subsections.

set(#ds,$htmlParseAuto("","<ol class=\"ipsList_inline ipsType_small subforums\" id=\"subforums_21\">","innerh"),"Global")

 

So remove the class from the string (the same way you would with Ayman's parser: tag, attribute, value), and you'll scrape the specific subsection.

 

set(#ds,$htmlParseAuto("","<ol id=\"subforums_21\">","innerh"),"Global")
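To show why removing the class changes the result, here's a small JavaScript sketch of pulling the tag, first attribute and value out of a pasted snippet. The script itself does this in the browser with jQuery's `$.parseHTML`; the regex version below is only a stand-in so it runs anywhere. `parseOpeningTag` is a made-up name, and it only handles double-quoted attribute values.

```javascript
// Hypothetical helper (not part of the script): extract the tag name and
// the FIRST attribute name/value pair from an opening tag.
function parseOpeningTag(html) {
  const tagMatch = /^\s*<([A-Za-z][\w-]*)/.exec(html);
  if (!tagMatch) {
    return null; // not something that looks like an opening tag
  }
  // Assumption: attribute values are wrapped in double quotes.
  const attrMatch =
    /^\s*<[A-Za-z][\w-]*\s+([^\s=>\/]+)\s*=\s*"([^"]*)"/.exec(html);
  return {
    tag: tagMatch[1].toLowerCase(),
    attribute: attrMatch ? attrMatch[1] : null,
    value: attrMatch ? attrMatch[2] : null,
  };
}

// With the class first, the class is what gets matched (all subsections).
console.log(parseOpeningTag(
  '<ol class="ipsList_inline ipsType_small subforums" id="subforums_21">'
));
// With the class removed, the id is first (one specific subsection).
console.log(parseOpeningTag('<ol id="subforums_21">'));
```

Only the first attribute makes it into the parser call, so whichever attribute you leave at the front of the pasted string decides what gets scraped.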

 

navigate("http://network.ubotstudio.com/forum/","Wait")

wait for browser event("DOM Ready","")
wait for browser event("Page Loaded","")
set(#ds,$htmlParseAuto("","<ol id=\"subforums_21\">","innerh"),"Global")

define $htmlParseAuto(#Document, #HTMLString, #Attribute To Scrape) {
    run javascript("function isJquery()\{
if( typeof jQuery === \"undefined\")\{
  
 var myJQScript = document.createElement(\"script\")
 myJQScript.src = \"https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js\"
 document.body.appendChild(myJQScript)
 return false
\}
  else\{
    return true
  \}
\}
")
    if($comparison(#Document,"=",$nothing)) {
        then {
            set(#Document,$document text,"Local")
        }
    }
    if($comparison($eval("isJquery()"),"= Equals","False")) {
        then {
            wait(0.5)
        }
    }
    run javascript("myHTMLParser=$.parseHTML(\'{#HTMLString}\')")
    clear list(%Attributes)
    if($comparison($find regular expression(#Attribute To Scrape,"^\".+?\"$"),">",$nothing)) {
        then {
            set(#Attribute,#Attribute To Scrape,"Local")
        }
        else {
            clear list(%Attributes)
            add list to list(%Attributes,$list from text("InnerText,
InnerHtml,
OuterHtml,
value,
id,
name,
src,
href,
type,
action,
title",","),"Delete","Local")
            set(#position,0,"Local")
            loop while($comparison(#position,"<",$list total(%Attributes))) {
                if($comparison($find regular expression($list item(%Attributes,#position),"(?i){#Attribute To Scrape}"),">",$nothing)) {
                    then {
                        set(#Attribute,$list item(%Attributes,#position),"Local")
                        set(#position,$list total(%Attributes),"Local")
                    }
                    else {
                        increment(#position)
                    }
                }
            }
        }
    }
    if($comparison(#Attribute,"=",$nothing)) {
        then {
            set(#Attribute,#Attribute To Scrape,"Local")
        }
    }
    wait(0.05)
    set(#Attribute,$replace regular expression(#Attribute,"\\s",""),"Local")
    return($plugin function("HTTP post.dll", "$html parser", #Document, $eval("myHTMLParser[0].nodeName.toLowerCase()"), $eval("myHTMLParser[0].attributes[0].name"), $eval("myHTMLParser[0].attributes[0].value"), #Attribute))
}

navigate("http://ubotstudio.com/index7","Wait")
wait for browser event("DOM Ready","")
wait for browser event("Page Loaded","")
set(#ds,$htmlParseAuto("", "<div class=\"tagline\">", "innert"),"Global")


Just saw your code, so much inspiration for me! Could you upload the .ubot file as a forum attachment? The forum filters out some of the code, so copy and paste doesn't work in UBot.

 

The error you get when switching to node view isn't because the code is corrupted. This script requires that you have the HTTP Plugin.


Check out my plugin, CSS Parser:

 

http://network.ubotstudio.com/forum/index.php/topic/18575-css3-selector-for-ubot-my-first-pluginxpath-alternative/

 

It's not as good as Ayman's HTML/XPath, but it's free. I plan on doing some upgrades and fixes to it in the new year.

 

**I have no idea why the script above doesn't work anymore; it worked when I uploaded it.**

