Today’s guest post is by a UBot user who has been willing to help out new customers for years, and who is a fantastic bot builder in his own right: Buddy. Buddy helped us with the initial ideas for a recent update to UBot Studio, an expansion of our Table Commands that lets you do far more interesting data manipulation, which is great for working with large amounts of data and complex scraping.
Buddy teaches in-depth UBot lessons. His website is here.
Let’s Talk About Tables!
I was thrilled when the “scrape table” node was introduced way back in the early days of UBot version 4.
But it did not take long to realize that more was needed. So, being the ever-diligent UBotter that I am, I started a thread on the forum and listed several other table-related nodes that I thought might be useful.
After some time and patience the nodes have been added.
So here is the breakdown (I was the one requesting these, after all, and I will be sharing some code here for you).
The “scrape table” node is quite powerful by itself. Once you have it scraping the data, it is quite the workhorse.
But if you have a website where the table is spread out over several web pages it then becomes a coding pain to transfer that data to a Master Table:
add table to table
The “add table to table” node simplifies this process.
Using the “scrape table” and “add table to table” nodes together means the only coding necessary is refreshing the website’s page or clicking a Next button.
In my bot, you will see my website get refreshed 5 times; each time, the data is scraped and then added to my Master Table.
Clean and neat. Also note that I clear the work table after each pass to keep memory clean.
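For readers who think in general-purpose languages, the pattern above can be sketched in Python. This is an analogy only, not UBot script: `scrape_page` is a made-up stand-in for the “scrape table” node, and the rows it returns are invented sample data.

```python
# Hypothetical stand-in for scraping the table on one page of results.
def scrape_page(page_number):
    return [[f"name-{page_number}-{i}", page_number * 10 + i] for i in range(2)]

master_table = []

for page in range(1, 6):             # five page loads, as in the bot described above
    work_table = scrape_page(page)   # scrape the table on the current page
    master_table.extend(work_table)  # the "add table to table" step
    work_table.clear()               # clear the work table before the next pass

print(len(master_table))  # number of rows collected across all pages
```

The loop body never changes from page to page; only the page load (or Next click) does, which is the point made above.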
delete from table
This is a very simple and clean node to use.
Some of the tables that I have had to scrape had excess data, in either rows or columns, that I just did not need.
The “Delete from Table” node now gives us the ability to delete entire rows or columns, depending on what our pruning needs are.
Understand that you will delete one row or one column at a time. To do more, you would have to add some logic to properly target which ones you want to delete. I like that part; just one at a time.
I also like the fact that it uses zero-based indexing. If I see in the debugging window that I have two columns to delete, I target the higher number first, say “5”, and then I target “3”, so the first deletion does not shift the index of the second. Easy and simple. I like this one.
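The reason for deleting the higher index first shows up clearly in a small Python sketch. This is an analogy using a list of lists, not UBot script, and `delete_column` is a hypothetical helper written for the illustration.

```python
table = [
    ["a0", "a1", "a2", "a3", "a4", "a5"],
    ["b0", "b1", "b2", "b3", "b4", "b5"],
]

def delete_column(rows, index):
    """Remove one column (zero-based) from every row, one column per call."""
    for row in rows:
        del row[index]

# Delete column 5 first, then column 3. Going highest-first means the
# earlier deletion never changes the position of the later target.
for index in sorted([3, 5], reverse=True):
    delete_column(table, index)

print(table[0])
```

Had column 3 been removed first, everything to its right would have shifted left by one, and the old column 5 would then be sitting at index 4.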
insert into table
Next up is the “Insert into Table” node.
Here is how I use this.
Maybe you have seen the CSV files that Google produces from the AdWords Keyword Tool, where the headers that are produced might not be to your liking.
After all, we want our customers to understand what they are looking at.
So I generally delete the header row and then I insert a new one and then I use the “Set Table Cell” node to add my column header titles.
Yes, I could have not used the “Insert into Table” node but then I would not have gotten to use it. LOL
My point in this example is that if you have a table that needs to be split up, you could insert column-header rows at specific data points and then prune off those sections of the table into other, independent tables.
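The header swap described above (delete the old header row, insert a blank row, then set each cell) can be sketched in Python. Again this is only an analogy, not UBot script, and the header names and data rows are made up rather than taken from a real keyword export.

```python
table = [
    ["Keyword", "Avg. Monthly Searches", "Competition"],  # made-up original header
    ["blue widgets", 1200, 0.4],
    ["red widgets", 800, 0.7],
]

new_titles = ["Search Term", "Searches per Month", "Competition Level"]

del table[0]                            # delete the old header row
table.insert(0, [""] * len(table[0]))   # insert a blank row at the top

# Fill the new row one cell at a time, like repeated "Set Table Cell" calls.
for column, title in enumerate(new_titles):
    table[0][column] = title

print(table[0])
```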
Now comes the “Sort Table” node.
Remember, before this we had nothing that did this.
Now we can sort! Yipee!
Granted, it is Ascending so the “A”s and the “0”s will be at the top when it is done. Hopefully, reversing that will be added at some point if many people want it.
But for now I am happy with this as I only do ascending sorts.
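As a rough picture of what an ascending sort does to a table, here is a Python sketch. It assumes the sort is keyed on a single column, which is my assumption for the illustration and not a statement about how the node is configured; the sample rows are invented.

```python
table = [
    ["Banana", "yellow"],
    ["Apple", "red"],
    ["007", "n/a"],
]

# Ascending sort on the first column: digits sort ahead of letters,
# so the "0"s land above the "A"s.
ascending = sorted(table, key=lambda row: row[0])

# In Python, a descending order is just the ascending result reversed.
descending = list(reversed(ascending))

print([row[0] for row in ascending])
```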
Now we can Search a Table.
You will find this node in the Parameters section under Data Functions.
Here’s the trick with this command. If you know the exact spelling of what you are searching for then this function will return the Index of the Row or Column (depending on what you selected).
So if you have a rather large table, you could search for a specific name and process just that record.
Or maybe you need to delete that row. Using this node along with “Delete from Table”, you now need just two commands, whereas in prior versions you would have had to code a lot of logic to accomplish the same feat.
This will save a lot of time!
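The search-then-delete combination can be sketched in Python as follows. This is an analogy, not UBot script: `search_rows` is a hypothetical helper, and returning -1 when nothing matches is my own choice for the sketch, not documented behavior of the node.

```python
table = [
    ["Alice", "alice@example.com"],
    ["Bob", "bob@example.com"],
    ["Carol", "carol@example.com"],
]

def search_rows(rows, column, value):
    """Return the zero-based index of the first row whose cell matches exactly."""
    for index, row in enumerate(rows):
        if row[column] == value:
            return index
    return -1  # assumption for this sketch: -1 means "not found"

found = search_rows(table, 0, "Bob")
if found != -1:
    del table[found]   # the "Delete from Table" step, using the returned index

# Exact spelling matters: a different capitalization does not match.
missing = search_rows(table, 0, "alice")
```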
$list from table
I saved the best for last.
This has got to be my favorite node. I no longer have to set up a loop that grabs a column of table data one cell at a time to load into a List.
This node does it all in one shot and it eliminates considerable coding.
I can now build a list quickly either by Column or Row but I can tell you now that the Column will be used the most.
And the REALLY cool thing is that this node can be dropped right into an “add list to list” node and my list will be created that fast.
So to recap, if I had 5 thumbs they would all be up for these newest enhancements to the Table commands.
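To picture the difference between the old loop and the one-shot approach, here is a Python sketch. It is an analogy, not UBot script; `list_from_column` and `list_from_row` are hypothetical helper names and the data is invented.

```python
table = [
    ["Alice", "alice@example.com"],
    ["Bob", "bob@example.com"],
    ["Carol", "carol@example.com"],
]

def list_from_column(rows, column):
    """Collect one column of the table into a flat list in a single step."""
    return [row[column] for row in rows]

def list_from_row(rows, row_index):
    """Collect one row of the table into a flat list."""
    return list(rows[row_index])

# Dropping the result straight into an existing list, in the spirit of
# nesting the node inside "add list to list".
names = ["Zed"]
names.extend(list_from_column(table, 0))

print(names)
```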
If you have been frustrated with processing tables then there is a table node in here that can surely help you. Plus, it will save you a ton of coding time.
Thanks UBot Devs! Great Job!