UBot Underground

XML - JSON - XPATH - REGEX



Hello Ubotters,

 

I'm currently trying to get my head around the different parsing options.

There are a lot of great plugins available for XML and JSON from Aymen, and there is the nice regex builder from HelloInsomnia.

Aymen's Socket Code Generator also has an XPath and regex builder integrated.

 

 

Question:

What are your use cases for the different options?
When do you use what?
What are the advantages / disadvantages of the different methods?

 

 

I think this is a topic that is confusing for a lot of people. I have a very basic understanding of it, but I think there are some more experienced people in the forum who can explain it a lot better than I can.

So if you are one of those, please share your knowledge :-)

 

Dan


XML and JSON parsers are for when you encounter those data formats. Take Reddit, for example:

 

http://www.reddit.com/r/listentothis.json

 

You can add .json to any Reddit page to get its data in that format. Fiverr offers something similar: http://www.fiverr.com/gigs/endless_page_as_json?host=category&type=endless_auto&category_id=2&limit=48&filter=auto&page=1
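To make that concrete, here is a minimal Python sketch of pulling titles and URLs out of a Reddit-style listing with a real JSON parser. The sample data is hand-written and heavily abridged (the titles and URLs are made up), and `extract_posts` is just a name chosen for this example:

```python
import json

# Hand-written, heavily abridged sample shaped like a Reddit listing.
# The titles and URLs are made up for illustration.
SAMPLE = """
{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "Artist A - Song One", "url": "http://example.com/1"}},
      {"kind": "t3", "data": {"title": "Artist B - Song Two", "url": "http://example.com/2"}}
    ]
  }
}
"""

def extract_posts(raw_json):
    """Return (title, url) pairs from a Reddit-style listing."""
    listing = json.loads(raw_json)
    children = listing["data"]["children"]
    return [(child["data"]["title"], child["data"]["url"]) for child in children]

posts = extract_posts(SAMPLE)
for title, url in posts:
    print(title, "->", url)
```

Once the text is parsed, every field is reachable by key, so there is no pattern to maintain when the server changes whitespace or field order.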

 

For XML, a lot of APIs and other services let you add &format=xml to get the response in a format that is easy for some applications to read (.NET <3 XML).

For instance, check out this Google autocomplete query: http://google.com/complete/search?client=hp&xml=true&q=hello

It uses &xml=true, but close enough :)
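As a rough illustration of reading such a response, here is a Python sketch using the standard library's ElementTree. The XML below is a hypothetical sample that I wrote for this example; the element and attribute names are assumptions, so check what the real endpoint returns before relying on them:

```python
import xml.etree.ElementTree as ET

# Hypothetical response; element and attribute names are assumptions.
SAMPLE_XML = """
<toplevel>
  <CompleteSuggestion><suggestion data="hello kitty"/></CompleteSuggestion>
  <CompleteSuggestion><suggestion data="hello world"/></CompleteSuggestion>
</toplevel>
"""

def suggestions(xml_text):
    """Return the 'data' attribute of every <suggestion> element."""
    root = ET.fromstring(xml_text)
    # ".//suggestion" uses the limited XPath subset that ElementTree supports.
    return [node.get("data") for node in root.findall(".//suggestion")]

print(suggestions(SAMPLE_XML))
```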

But most of the time you will deal with HTML, and that is where XPath comes in. Use it as much as possible, with regex to back it up. I like to use XPath to isolate an element, and then sometimes you need regex to drill down further into its text.
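Here is a small Python sketch of that "XPath first, regex second" workflow. It assumes a made-up, well-formed page snippet, because the standard library's ElementTree only parses well-formed markup and supports a limited XPath subset; real-world HTML usually needs a more forgiving parser:

```python
import re
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet standing in for a downloaded page.
PAGE = """
<html><body>
  <div class="product"><span class="price">Price: $19.99 (incl. tax)</span></div>
  <div class="product"><span class="price">Price: $5.00 (incl. tax)</span></div>
  <div class="footer"><span>Call 555-1234 for $1.00 offers</span></div>
</body></html>
"""

# Regex for the drill-down step: a dollar amount with two decimals.
PRICE_RE = re.compile(r"\$(\d+\.\d{2})")

def extract_prices(html_text):
    """Isolate the price spans with XPath, then pull the number out with regex."""
    root = ET.fromstring(html_text)
    prices = []
    for span in root.findall(".//span[@class='price']"):
        match = PRICE_RE.search(span.text or "")
        if match:
            prices.append(float(match.group(1)))
    return prices

print(extract_prices(PAGE))
```

Running the regex over the whole page would also pick up the "$1.00" in the footer; isolating the price spans first avoids that.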

If you don't use the HTTP Post plugin, you won't be using XPath with UBot, and then you are basically stuck with regex :(


To start the conversation, here is how I understand it:

 

XML and JSON are both ways to deliver data in a structured form. They are mainly used between websites and backend servers, but can be used for other things as well:

http://en.wikipedia.org/wiki/JSON

http://en.wikipedia.org/wiki/XML

 

There is a lot of discussion about which one is better:

http://www.nczonline.net/blog/2008/01/09/is-json-better-than-xml/

 

 

 

OK, here are the UBot options I'm aware of:

 

1. XML plugin by Aymen

http://www.ubotstudio.com/forum/index.php?/topic/13088-ubot-xml-plugin-ubot-discount/

 

I'm not sure how that XML plugin compares to XPath in general. Can it do the same?

 

2. XPath

XPath is a query language for selecting data from XML documents.

 

Aymen's HTTP Post plugin has an XPath parser integrated:

http://www.ubotstudio.com/forum/index.php?/topic/12837-sell-http-post-plugin-crazy-bonuses-inside/?hl=http%20post

 

His Socket Code Generator has an XPath and regex builder:

http://www.ubotstudio.com/forum/index.php?/topic/16057-sell-%E2%96%BA%E2%96%BAsocket-code-generator%E2%97%84%E2%97%84-%E2%9C%94-http-source-codes-generator-for-ubot/

 

3. JSON

If you want to query JSON in a similar way (as XPath does for XML), you can use the JSON plugin from Aymen:

http://www.ubotstudio.com/forum/index.php?/topic/16166-free-plugin-jsonpath-parser-plugin/?hl=xpath
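To show the general idea of addressing JSON by path, here is a toy Python sketch. It is not the plugin's syntax and not real JSONPath; `get_path` and its dotted-path format are invented for this example:

```python
import json

def get_path(data, path):
    """Follow a dotted path such as 'store.books.1.title' through dicts and lists.

    Toy helper for illustration only; this is not JSONPath syntax.
    """
    current = data
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]  # list positions are given as numbers
        else:
            current = current[key]
    return current

doc = json.loads(
    '{"store": {"books": [{"title": "A", "price": 10}, {"title": "B", "price": 12}]}}'
)
print(get_path(doc, "store.books.1.title"))
```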

 

 

4. Regex

This is not specific to XML or JSON. It's a way to define search patterns, and it can be used to search for patterns in any kind of text.

So it can be used to get data from normal websites, but also from XML and JSON.

But it's not specialized for those formats, so most of the time it's easier to use one of the specialized methods if you have to deal with XML or JSON.
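A short Python sketch of both sides of that point, using made-up sample strings: regex works well on free text, but a naive pattern can trip over JSON escaping that a real parser handles for free.

```python
import json
import re

# Regex on free text: there is no structure to parse, so patterns are the right tool.
TEXT = "Contact: anna@example.com or bob@example.org"
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", TEXT)

# Regex on JSON: a naive pattern stops at the first escaped quote.
RAW = r'{"title": "The \"best\" song", "score": 42}'
naive = re.search(r'"title":\s*"([^"]*)"', RAW).group(1)

# A JSON parser understands the escaping and returns the full value.
parsed = json.loads(RAW)["title"]

print(emails)
print(repr(naive))
print(repr(parsed))
```

The regex could be patched to handle escaped quotes, but every such patch re-implements a piece of the parser that already exists.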

 

 

 

OK guys, that's my limited view of this world :-)

If there is something wrong or missing, please feel free to jump in and help out.

 

Thanks in advance for your help

Dan


I mostly use XML, JSON, REST and SOAP for communicating with the different APIs.

It is also partly about what you find easiest to use, where you are going to use it, and which formats the APIs support. For the Amazon API I went with SOAP instead of the other options, simply because I found it a better fit for that API.

XML is something I use when exporting from a database to certain websites where you can bulk upload products etc. (Google Shopping and other price comparison sites).

JSON I also use with some APIs, and I used it in combination with regex and HTTP before UBot had a JSON parser.

 

My typical usage would be:

XPath (because it is easy)

CSSPath (also very easy)

Regex (one of my favorites for getting jobs done: very powerful, tons of options, and you can get really advanced with it)

 

So I do not favor any of them. They all have their place, and it really depends on what you are doing and what works on that project.

Because I am lazy and want to get the job done fast, I mostly try XPath and CSSPath first before I start using regex.


If there is a known parser for a particular output format, then by all means use it. People created it to make that format easy to parse, so you get results faster.

You can use regex for everything; after all, nearly all parsers available are based on regex!
