UBot Underground


Hello Guys,

 

I recently did some testing with UBot and threads, and I would like to give you a short update on my experience.

 

 

General Information:

If you want to use threads within UBot, you need a way to control the number of threads you start simultaneously. Otherwise, if you launch a lot of threads (with a loop, for example), UBot will start as many threads as possible until it crashes.

 

So normally you will have some wait code like this (incomplete code snippet to make a point!):

loop(50) {
    loop while($comparison(#threadcount, ">=", 50)) {
        wait(0.1)
    }
    launchmoreThreads()
}
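For readers who want to try the throttling idea outside UBot, here is a minimal sketch in Python (not UBot script). It mirrors the wait loop above: the launcher only starts a new worker when the running count is below the limit. The names `MAX_THREADS`, `running`, `peak`, `worker` and `launch_all` are made up for this illustration, and the limit of 5 is an arbitrary assumption.

```python
import threading
import time

MAX_THREADS = 5          # hypothetical limit, stands in for the 50 above
running = 0              # plays the role of #threadcount
peak = 0                 # highest number of workers counted at once
lock = threading.Lock()  # guards both counters


def worker() -> None:
    global running
    time.sleep(0.01)     # placeholder for the real work
    with lock:
        running -= 1     # the worker reports that it is done


def launch_all(jobs: int) -> None:
    global running, peak
    threads = []
    for _ in range(jobs):
        # Wait while the limit is reached, like the inner "loop while".
        while True:
            with lock:
                if running < MAX_THREADS:
                    running += 1              # count the thread before it starts
                    peak = max(peak, running)
                    break
            time.sleep(0.001)
        t = threading.Thread(target=worker)
        t.start()
        threads.append(t)
    for t in threads:
        t.join()


launch_all(20)
print(running, peak)
```

The counter is incremented by the launcher before the thread starts, so the limit cannot be overshot while a new thread is still spinning up.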

 

 

There are currently three ways to count the number of running threads:

1. Built-in features such as variables

2. Free plugin (Threads Counter Plugin) LINK

3. Paid plugin (Advanced Ubot) LINK

 

 

1. Use a global variable to count threads

 

This does not work correctly!

I have seen two major issues with it.

 

1. The number of running threads slowly drops. If you run a loop with 5000 repetitions, you will have the full number of threads at the beginning. But very quickly it drops to only 1-5 threads in parallel, and eventually to a single thread.

 

2. The global variable does not count correctly. If you run 1000 loops with 1000 increments and 1000 decrements, the variable should be 0 at the end. But that's not the case.

 

 

Status V4.2.20: The error occurs and can be reproduced with the attached test script.

Status V5.0.10: Still happening. The drop in the number of threads starts later than in V4, but after 2000 loops it also goes down to a single thread.

 

Most of the time the number of loop iterations is correct. So when I loop 5000 times, I have 5000 rows in my table. That works fine 99% of the time.

But it has also happened that I only had 4998 rows.

 

But the thread counter variable is messed up every time, even in those cases where the row count is correct.

 

 

Conclusion:

Using a global variable to control your threads might work fine for a low number of threads. But for higher numbers (scraping bots) you will see these errors.

For scraping it might not matter if 3 links are missing when you scrape 10000 links. But for account signups and similar tasks, it might get messy.

 

I think this is related to the problem that global variables and lists are not completely thread safe.
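
To show what "not thread safe" can mean for a counter, here is a small Python illustration (not UBot script). It steps through one unlucky interleaving by hand: two threads both read the counter before either writes it back, so one increment is lost. Whether UBot's variables fail in exactly this way is an assumption; the sketch only shows the general mechanism.

```python
# Each "thread" performs counter = counter + 1 as two separate steps:
# read the current value, then write back value + 1.
counter = 0

read_by_a = counter        # thread A reads 0
read_by_b = counter        # thread B reads 0 before A has written
counter = read_by_a + 1    # thread A writes 1
counter = read_by_b + 1    # thread B also writes 1, so A's update is lost

lost_update_result = counter

# The same two increments without interleaving give the expected value.
counter = 0
counter = counter + 1
counter = counter + 1
serial_result = counter

print(lost_update_result, serial_result)  # prints "1 2"
```

The same thing can happen with decrements, which would explain a counter that does not return to 0 after an equal number of increments and decrements.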

 

 

Related Tracker / Support Item:

http://tracker.ubotstudio.com/issues/379

 

 

2. Free Plugin Threads Counter

 

Status V4.2.20: Works perfectly fine for me. I haven't had a single issue or wrong count, and I never missed a row in the table.

Status V5.0.10: The plugin itself works perfectly fine. But in 8 out of 50 test runs UBot crashed completely with a .NET error. I'm not sure whether that is a general threading issue in V5 or somehow related to the plugin. I only ran my tests in UBot Studio, so it might be different when you compile the bot.

 

Related Tracker / Support Item:

http://tracker.ubotstudio.com/issues/368#change-1194

 

The support team refused to troubleshoot an issue when I use the plugin, so I modified the code so that the crash happens without the plugin as well.

Just launching enough threads at the same time can crash UBot. But V4 also freezes when you do that, so I'm not sure if they will look into it at all...

 

 

3. Paid Plugin (Advanced Ubot)

 

Status V4.2.20: Works perfectly fine for me. I haven't had a single issue or wrong count, and I never missed a row in the table.

Status V5.0.10: This feature is not working at all in V5. I sent a summary to Pesh and he will probably fix that quickly.

 

 

 

General Conclusion:

 

If you need multithreading, I highly recommend you stick to version 4.2.20 at the moment.

And you should definitely use one of the plugins to count and control the number of running threads.

 

The free plugin works perfectly fine. If you own Advanced Ubot you can use that one instead, which reduces the number of plugins in your bot and the size of the compiled bot.

 

 

 

Attached is a sample .ubot file where you can test the three different scenarios on your own.

If someone finds an error or knows how to improve something, please let me know.

 

 

Hope this is helpful for some of you.

 

Kindest regards

Dan

Threads and Add Items (test).ubot


There are actually more ways to count, since you can also use tables and just use the count function to get the number, as discussed in this thread: http://www.ubotstudio.com/forum/index.php?/topic/15122-must-read-threading-doesnt-work-as-expected-tested-in-v4 (it didn't work with lists)

 

Also, about the issue: it was already reported a while ago here, but the UBot team closed it, saying they can't reproduce it. Really sad...

I still hope that they understand the problem and will fix it within UBot, because there is also a problem with global lists, as discussed in another thread.

And I don't understand why they say that they can't reproduce it. :-(

 

Could you explain a little bit how you made your counter thread safe? Is there a special C# function you used to do that? Just on a very high level, how can that issue be solved?

I would love to get a basic understanding of the solution, if you don't mind sharing that.

 

Dan


Thanks for the kind words. But I did not invent it or figure it out. Credit should go to guys like UbotDev.

 

I just summarized some stuff and brought it in front of the community again. So hopefully the dev team will pick it up and finally fix the threading issues.

 

Dan


Yes, as I've mentioned on my blog, I use atomic operations instead of regular ones (the .NET class is called Interlocked, if you are interested in that).
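
As a rough illustration of the idea, here is a Python sketch of a counter whose increment and decrement cannot interleave. Python has no direct equivalent of .NET's Interlocked, so a `threading.Lock` stands in for the atomic operation here; this is not the plugin's actual code. The class name `AtomicCounter` and the helper `run_test` are made up for this example.

```python
import threading


class AtomicCounter:
    """Counter whose increment and decrement cannot interleave."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> int:
        with self._lock:          # read, add and write happen as one step
            self._value += 1
            return self._value

    def decrement(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_test(repetitions: int = 1000) -> int:
    counter = AtomicCounter()

    def job() -> None:
        counter.increment()       # thread starts
        counter.decrement()       # thread finishes

    threads = [threading.Thread(target=job) for _ in range(repetitions)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


final_value = run_test()
print(final_value)  # prints "0"
```

This repeats the test from the first post (1000 increments and 1000 decrements), and with the protected counter the result is 0.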

Cool! Thanks a lot for that!

 

Dan


Great summary, thanks Dan! Looks like Pash has already fixed the error with V5 thread control.

 

I have a question. Do you guys feel it would be useful to have more variables that we could increment/decrement with atomic operations? I have some complex bots where I not only increment/decrement the threads, but also need to change some global variables in every thread (incrementing/decrementing would be enough). If there is enough demand, we could ask Pash (or UbotDev if you're up for it) to add a few more (maybe max three) variables like that. Or is it just me?

 

(obviously the solution would be for Ubot to use variables like that, but I gave up on waiting for them to fix all this)


Also, Dan, for future reference regarding C#: you can use locks and mutexes to lock segments of code. I have a mutex class that also times out, which prevents the case where the code inside the mutex fails and your whole bot locks up because the mutex (lock) hasn't been released.

 

If you're doing C# and would like a copy, let me know.
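
The timeout idea can be sketched in Python as well (this is not the C# class offered above, just an assumption about how such a guard might look). `threading.Lock.acquire` accepts a `timeout`, so a caller can give up instead of waiting forever. The names `guard` and `run_guarded` are made up for this example.

```python
import threading

guard = threading.Lock()


def run_guarded(action, timeout_seconds: float) -> bool:
    """Run action under the lock; give up if the lock is not free in time."""
    if not guard.acquire(timeout=timeout_seconds):
        return False              # someone else still holds the lock
    try:
        action()
        return True
    finally:
        guard.release()           # always released, even if action() raises


results = []
ok_when_free = run_guarded(lambda: results.append("done"), 0.1)

guard.acquire()                   # simulate a holder that never lets go
ok_when_stuck = run_guarded(lambda: results.append("never"), 0.1)
guard.release()

print(ok_when_free, ok_when_stuck, results)  # prints "True False ['done']"
```

The `finally` block covers the failure case inside the guarded code, and the timeout covers the case where the lock is never released by someone else.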


Dan you really went to town on your homework there, well done, great info. I gave up using threads, and instead use multiple copies,

Ditto

 

I make more bots this way. When I get around to threading them I do this ^^^

 

But I realized long ago that it would be quicker to learn C# than to wait for the "Dev Team" to get anything done about this or anything else.

 

Thanks Dan for the good work....keep up the fight. I will send you a C# course that someone shared in HTTP Skype Group

 

TC

I was already thinking to add more counters, but I don't think they are required. The threads counter is the only variable that really needs the value to be set from threads, there is no way around it, but for others you can pass value back to main/UI thread and increment/decrement it there. Do you think we really need more?

I'm trying to think of a simple example, I hope it's a good one. We have a simple multi-threaded bot that navigates to one URL after the other. We have two lists:

1. List of URLs the bot navigates to

2. List of proxies the bot uses (but also checks before usage)

 

Basically this is what the bot would look like:

 

1. Like you said we simply increment the row number (global variable) in the main thread, and pass the value to the define with the thread inside.

2. But before the bot navigates to the URL with the given row number, there is a proxy checker inside the thread (a loop until it finds a good proxy). So every time a bad proxy is found we need to increment the row number (global variable) for this too (each proxy check can take 5-30 seconds).

3. When the bot finally finds a good proxy, it navigates to the URL from the list (row number passed from outside the thread - OK) using the proxy from the list (row number passed from inside the thread - CAN CAUSE PROBLEMS).

 

What do you guys think?


If the previous example is not enough, here's another one:

 

Same bot as before (navigating to URLs), but we don't want the bot to close the threads (opening and closing the threads takes time, and we don't want that). We want the threads to stay open and work in a loop, so we'll have to increment the row number for the URLs to visit from within the thread (and we want the bot to visit the URLs in order). This is needed because we won't be able to reach the maximum number of open threads if we keep closing each thread after the visit (checking the number of open threads also takes time, even with 0.1 or 0 delay).


One more, the ultimate example:
 

The bot operation doesn't even matter here. We want to keep track of how many threads actually finish the given operation (if a thread crashes before finishing, we want to know that). For that we'll have to have a variable that gets incremented every time a thread finishes successfully (so it finishes even the last command in the thread). So we'll have to increment our variable (of successful thread operation) at the end of the thread (after or before the decrement threadcount command).

 

Nevertheless, there are many complex bots where you won't be able to change all the variables in the main thread. That's why I thought it would be nice to have say 3 more variables that could be incremented/decremented using atomic operations.

 

Let me know what you guys think.

 

Thanks,

Marton


I'm trying to think of a simple example, I hope it's a good one. We have a simple multi-threaded bot that navigates to one URL after the other. We have two lists:

1. List of URLs the bot navigates to

2. List of proxies the bot uses (but also checks before usage)

 

Basically this is what the bot would look like:

 

1. Like you said we simply increment the row number (global variable) in the main thread, and pass the value to the define with the thread inside.

2. But before the bot navigates to the URL with the given row number, there is a proxy checker inside the thread (a loop until it finds a good proxy). So every time a bad proxy is found we need to increment the row number (global variable) for this too (each proxy check can take 5-30 seconds).

3. When the bot finally finds a good proxy, it navigates to the URL from the list (row number passed from outside the thread - OK) using the proxy from the list (row number passed from inside the thread - CAN CAUSE PROBLEMS).

 

What do you guys think?

I would send the proxy list as a variable to every thread. And within the thread I would figure out a good one.

With a random list item you can mix it up pretty easily.

 

Dan


If the previous example is not enough, here's another one:

 

Same bot as before (navigating to URLs), but we don't want the bot to close the threads (opening and closing the threads takes time, and we don't want that). We want the threads to stay open and work in a loop, so we'll have to increment the row number for the URLs to visit from within the thread (and we want the bot to visit the URLs in order). This is needed because we won't be able to reach the maximum number of open threads if we keep closing each thread after the visit (checking the number of open threads also takes time, even with 0.1 or 0 delay).

If you want to keep 10 threads running and process 100 URLs:

Send 10 URLs to each thread and loop through those 10 URLs within the thread.
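
A minimal Python sketch of that batching idea (not UBot script): the URL list is split into 10 contiguous batches, and each thread works through its own batch. The URLs and the names `split_into_batches` and `process_batch` are placeholders invented for this example.

```python
import threading


def split_into_batches(items: list, batch_count: int) -> list:
    """Split items into batch_count contiguous batches of near-equal size."""
    size, extra = divmod(len(items), batch_count)
    batches, start = [], 0
    for i in range(batch_count):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def process_batch(batch: list, results: list, slot: int) -> None:
    # Placeholder for "navigate to each URL"; each thread fills only its own slot.
    results[slot] = [url for url in batch]


urls = [f"http://example.com/page{n}" for n in range(100)]  # hypothetical URLs
batches = split_into_batches(urls, 10)
results = [None] * len(batches)

threads = [
    threading.Thread(target=process_batch, args=(batch, results, slot))
    for slot, batch in enumerate(batches)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(batches), [len(batch) for batch in batches])
```

Because every thread gets its own batch and its own result slot, no shared counter is needed. The order of visits across threads is not guaranteed, only the order within each batch.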

 

Dan


One more, the ultimate example:

 

The bot operation doesn't even matter here. We want to keep track of how many threads actually finish the given operation (if a thread crashes before finishing, we want to know that). For that we'll have to have a variable that gets incremented every time a thread finishes successfully (so it finishes even the last command in the thread). So we'll have to increment our variable (of successful thread operation) at the end of the thread (after or before the decrement threadcount command).

 

Nevertheless, there are many complex bots where you won't be able to change all the variables in the main thread. That's why I thought it would be nice to have say 3 more variables that could be incremented/decremented using atomic operations.

 

Let me know what you guys think.

 

Thanks,

Marton

 

Add your list to a table.

 

Loop once for every row in the table.

Send the row position to your thread.

And write to the global table.

 

Each thread will process a different row. And each thread can write to a second column and save the status: OK, Error and so on.
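
The table approach can be sketched in Python like this (not UBot script). Each thread receives a row position and writes its status only into that row, so no two threads touch the same cell. The table contents and the failure condition are invented for the illustration.

```python
import threading

# Column 0 holds the input item, column 1 holds the status written by the thread.
rows = [[f"item{n}", ""] for n in range(20)]


def process_row(position: int) -> None:
    item = rows[position][0]
    try:
        if item.endswith("7"):             # hypothetical failure condition
            raise ValueError("simulated failure")
        rows[position][1] = "OK"
    except ValueError:
        rows[position][1] = "Error"


threads = [threading.Thread(target=process_row, args=(i,)) for i in range(len(rows))]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The statistics are counted once, after all threads have finished.
ok_count = sum(1 for row in rows if row[1] == "OK")
error_count = sum(1 for row in rows if row[1] == "Error")
print(ok_count, error_count)  # prints "18 2"
```

A row whose status is still empty after the run would point to a thread that never finished.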

 

Dan


I would send the proxy list as variable to every thread. And within the thread I would figure out a good one. 

With random list item you can mix it up pretty easily. 

 

Dan

That wouldn't work if we want the proxies to be processed and checked in order (check the current proxy, and use it if it's good... if not, check the next one in the list, and so on). So whichever thread finishes checking the current proxy, we want it to process the next one from the global list.
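
Outside UBot, this "whoever is free takes the next item in order" pattern is usually built on a thread-safe queue. Here is a minimal Python sketch under that assumption (it is not something UBot offers out of the box). The proxy names, `checker`, and the `handed_out` log are invented for the example.

```python
import queue
import threading
import time

proxies = [f"proxy{n}" for n in range(12)]   # hypothetical proxy names
pending = queue.Queue()
for proxy in proxies:
    pending.put(proxy)                        # FIFO: handed out in list order

handed_out = []                               # order in which proxies were taken
handed_out_lock = threading.Lock()


def checker() -> None:
    while True:
        # Take the next proxy and log it in one step, so the log order
        # matches the order in which proxies were handed out.
        with handed_out_lock:
            try:
                proxy = pending.get_nowait()
            except queue.Empty:
                return                        # nothing left to check
            handed_out.append(proxy)
        time.sleep(0.001)                     # placeholder for the slow proxy check


threads = [threading.Thread(target=checker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(handed_out == proxies)  # prints "True"
```

The checks start in list order, but since each check takes a different amount of time they can still finish out of order.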

 

If you want to keep 10 threads running and process 100 urls. 

Send 10 urls to each thread and loop through those 10 urls within the thread. 

 

Dan

I specified: "we want the bot to visit the URLs in order"

That wouldn't work with your method (I thought of that too, but it's not good enough). The URLs wouldn't necessarily be visited in order, since the loading time is not constant (example: thread 8 could finish sooner than thread 2, even if you add some delay between starting the threads).

 

Add your list to a table. 

 

Loop once for every row in the table.

Send the row position to your thread.

And write to the global table. 

 

Each thread will process a different row. And each process can write to a second column and save the stats. OK, Error and so on.

 

Dan

Table operations are slower than variable operations, they increase memory usage and increase the chance of the bot crashing (especially with larger tables).

Think of it this way: we want a simple stat monitor that starts from 0, and only increases when a thread successfully finishes. It would be nice to be able to do that with a simple increment.

 

 

 

Thanks for thinking these through,

Marton

  • 9 months later...

Hello Guys,

 

I did some testing with Ubot and Threads recently. And I would like to give you a short update about my experience....

...

...

 

1. Use a global variable to count threads

 

This is not working correctly!

I have seen two major issues with it.

 

1. Threads are slowly going down. If you run a loop with 5000 repetitions. You will have the full amount of threads at the beginning. But very quickly it will go down and only run 1-5 threads in parallel. Even going down to only 1 thread. 

 

2. The global variable is not counting correctly. If you run 1000 loops with 1000 x increase and 1000 x decrease the variable should be 0 at the end. But that's not the case.

 

 

Status V4.2.20:  Error is happening and can be reproduced with the attached test script. 

Status V5.0.10 : Still happening. The threads decrease starts later than in V4. But after 2000 loops it will also go down to a single thread.

 

..

..

 

I thought I would revisit this old thread and run the first test. Every time I run it, I get 5000 rows in the table. All seems to work okay.

I increased the loop to 50k ... everything runs fine too.

 

Is threading working okay now and I missed the announcement, or am I missing something?

Edited by Pete_UK

The counting issue has been fixed, Pete, but there are still other limitations, mainly when you use plugin commands / functions within threads.

 

But you have to test it for your scenario. A lot of stuff is working OK.

 

Dan
