Posts posted by jewcat
-
ya, i'd just $replace all $newline with $nothing, should do the trick. or at least get you close to your aim.
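For anyone doing the same thing outside uBot, here's the same trick as a minimal Python sketch ($replace, $newline, and $nothing are uBot functions, not Python):

```python
def strip_newlines(text: str) -> str:
    """Replace every newline (and carriage return) with nothing,
    mirroring the $replace $newline -> $nothing suggestion above."""
    return text.replace("\r", "").replace("\n", "")

print(strip_newlines("one\ntwo\r\nthree"))  # -> onetwothree
```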
-
I agree that the UI could be far more flexible, and one of the areas where ubot is most lacking is in options for compiled bots (ex. I would love to be able to set the onload URL, so the bot opens to a webpage instead of a blank white page). But in the long run I'd say UI flexibility is an improvement that can wait a good while yet. There are lots of long-needed improvements still on the waiting list for ubot's actual functionality that are far more important than making compiled bots look prettier for the end user.
-
i haven't used it yet, but i was under the impression that ubot lets you tap directly into your own decaptcher account. So if that's the case, I'd assume you're only going to pay the $2/1000. Hopefully that's the case; I need to implement decaptcher into the current bot I'm working on too...
-
Any news regarding the new beta release? What's the status at this time?
-
bumping this one, this is getting to be a recurring issue when i'm looking at automating processes on a variety of sites. how can i work with this?
-
Was going to continue this thread http://ubotstudio.com/forum/index.php?/topic/1943-seeking-help-with-squidoo-lens-making-bot/page__hl__squidoo__fromsearch__1 but it seems no one had anything to say and it doesn't sound like the OP ever made any progress.
The issue I'm seeing is that when you click an edit button you get this flashy dHTML lightbox sort of popup window. I encountered the same thing a few times, and it never seems to let me select any elements within these popups.
Is there a way to handle these? Same sort of popup boxes you see on Facebook and Google Buzz...
-
I'm using a piece of PHP on my server. The bot inputs a directory name into one text input and a list of scraped image URLs into another. The files are then downloaded directly to my server to work with. I had it hacked together by a coder on DP. The only issue is that it seems to drop about 20% of the images without fully downloading them.
Still can't wait for an actual Save As feature to come along.
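The ~20% drop-off sounds like timeouts or dropped connections, and a retry loop on the download side usually recovers most of those. A hypothetical Python sketch of the idea (the real script is PHP on my server; the function name and attempt count here are made up):

```python
import urllib.request

def download_with_retry(url: str, dest_path: str, attempts: int = 3) -> bool:
    """Try downloading a file up to `attempts` times; return True on success.
    Only writes dest_path once the whole response body has been read."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                data = resp.read()
            with open(dest_path, "wb") as f:
                f.write(data)
            return True
        except OSError:
            continue  # timed out or connection dropped; try again
    return False
```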
-
Just curious which captchas are supported by decaptcher. Do they handle ANY image captcha?
-
i clear and reload personally.
-
Gonna drop a noob question here, lol. What do you mean by "hooks that are removed in compiled bots"?
-
having an issue sort of like this with a client's bot. everything works fantastically locally, yet it refuses to click some links when running on the client's machine; not sure why yet.
-
i just did
select by attribute -> innertext
click chosen
and it clicked fine here. Don't know what to say.
-
Use the Document Constant $meta keywords; you can also scrape the meta description...
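Outside uBot, the same meta data can be pulled with a simple regex. A rough Python sketch (it assumes double quotes and name-before-content attribute order, which real pages won't always follow):

```python
import re

def scrape_meta(html: str, name: str) -> str:
    """Pull the content attribute of a <meta name="..."> tag.
    Simplified: assumes name comes before content, with double quotes."""
    m = re.search(
        r'<meta\s+name="%s"\s+content="([^"]*)"' % re.escape(name),
        html, re.IGNORECASE)
    return m.group(1) if m else ""

html = '<meta name="keywords" content="gossip, celebs">'
print(scrape_meta(html, "keywords"))  # -> gossip, celebs
```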
-
navigate away to another page, then retry. usually solves the problem for me.
-
Here, give this a try...
-
problem solved. selecting a table element or div by position has worked in most cases.
-
having a hell of a time trying to load the article text in a file here on this particular article site. i have tried $page scrape $scrape chosen attribute. tried loading it into a variable, tried loading it into a list. Just can't seem to pull the data from here. Any suggestions?
Demo article url: http://www.articledashboard.com/Article/Celebrity-Gossip--Celeb-Relationships-/1415867
I tried $page scraping between the <p></p>, which I thought would be the best option, but then I discovered it was scraping a different element elsewhere on the page. So I tried stepping up and scraping between the <td></td>, but it returns a blank value.
Any suggestions?
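The "scrape between two markers" idea can also be sketched in plain code. The marker strings below are hypothetical, since I haven't pinned down which snippet is unique in that page's source; the point is to anchor on something that occurs only once near the article body:

```python
import re

def scrape_between(html: str, start_marker: str, end_marker: str) -> str:
    """Return the first chunk of text between two markers.
    Non-greedy, so it stops at the first end_marker after start_marker."""
    m = re.search(re.escape(start_marker) + r"(.*?)" + re.escape(end_marker),
                  html, re.DOTALL)
    return m.group(1).strip() if m else ""

page = '<td class="article"><p>Celebrity gossip body text here.</p></td>'
print(scrape_between(page, '<td class="article">', "</td>"))
# -> <p>Celebrity gossip body text here.</p>
```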
-
ya, definitely going to be multiple saves along the way from this point forward.
-
I have a funny feeling I am not going to like the answer I get from this thread, but is there any way to recover a corrupted bot source? I have been working on a project for someone and just had a freeze up on my computer. I saved the bot I was working on, rebooted the computer, and came back to ubot to get things completed. Every time I attempt to open the file, ubot crashes. I can open my other saved files just fine.
Is this file lost forever? Nothing is popping up from the emergency backup feature, so I'm assuming the save itself went through; it just saved a corrupted file. Tell me there's some way to recover this work. Things were about 80% complete, and if I've got to start over, this is a day down the drain...
-
So simple I'm embarrassed to have not thought of it myself. Nice loop action happening.
Thanks bluegoat!
-
I don't see why you couldn't set it up to run with Windows Scheduled Tasks. Shouldn't really need CRON, though you can install CRONw on a windows server.
-
Being able to save files and name them is a MUST. I really want to be able to save text files and open them back up in ubot (by reusing the save location and file name variables to open those files).
Remember, SAVE AS, not SAVE!
Agreed
-
Can't wait for a $local folder to come along. Until then I'll have to try this out with a compiled bot.
-
Maybe I'm just not seeing it, but I fail to see where the replacement is happening there at all.
What I have is a list, that I want to remove certain parts from. Another good example would be something like an html snippet that you want to remove all italics and bold font styles from.
<p>
The dog walked <strong>across</strong> the road
</p>
<p>The cat ran <em>away from</em> the dog.
</p>
So I'd have to replace <strong>, </strong>, <em>, & </em> all with a $nothing variable.
Now I can easily take one of those strings of text out, but it's the next one where I have problems. Moving from removing all spaces on to removing all ' and " characters is where I get stuck: I can get the spaces out, but then getting rid of the quotation marks and other invalid characters doesn't work.
I am stripping down keyphrases for input into a form. Occasionally the phrases contain an invalid character, and I need to strip all of these invalid characters out.
The snippet posted seems to remove entire lines from the list. I need to remove characters (or entire strings in the case of the HTML example) from each item.
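In plain code terms, the fix is to run every replacement over each item rather than deleting matching lines from the list. A minimal Python sketch (the junk list here is just an example set; swap in whatever tags or invalid characters you need):

```python
def clean_items(items, junk=("<strong>", "</strong>", "<em>", "</em>", "'", '"', " ")):
    """Replace each junk string with nothing inside every item,
    instead of removing whole items from the list."""
    cleaned = []
    for item in items:
        for j in junk:
            item = item.replace(j, "")
        cleaned.append(item)
    return cleaned

print(clean_items(["The dog walked <strong>across</strong> the road"]))
# -> ['Thedogwalkedacrosstheroad']
```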
weird scrape protection in place on a site, any ideas for an easier workaround?
Alright, I'm scraping a site for specs on different electronics. Each item on the site has a large number of different values to scrape and move into a CSV. Some pages have some specs while others don't, so I am doing a if command to search the page for each value's heading before setting the variable value and adding it into the CSV string.
Then I encountered a problem. From page to page the headings for each value are different:
Digital<FONT color=#fefefe>:</FONT>Media<FONT color=#fefefe>_</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:
Digital<FONT color=#fefefe>;</FONT>Media<FONT color=#fefefe>+</FONT>Broadcast<FONT color=#fefefe>-</FONT>Tuner:
Digital<FONT color=#fefefe>_</FONT>Media<FONT color=#fefefe>;</FONT>Broadcast<FONT color=#fefefe>;</FONT>Tuner:
Digital<FONT color=#fefefe>+</FONT>Media<FONT color=#fefefe>-</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:
They are hiding different characters between the words, matched to the background color of the table cell, so simply doing a page search for "Digital Media Broadcast Tuner" will not work on every page. Now I could go through and create an if command for every single variant, but let's face it, there are A LOT of values to scrape, and creating 4-16 if commands for every single value is going to take ages. Is there no way I can use a wildcard or something to ease the pain here?
Any ideas would be appreciated.
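One approach that avoids an if command per variant: normalize the page first by stripping the color-matched <FONT> filler, whatever single character it hides, then run one plain-text search per heading. A Python regex sketch, assuming the filler is always wrapped in <FONT color=#fefefe>...</FONT>:

```python
import re

# Matches the background-colored filler tags, whatever character they hide.
FILLER = re.compile(r"<FONT color=#fefefe>.*?</FONT>", re.IGNORECASE)

def normalize(html: str) -> str:
    """Replace each hidden-character <FONT> span with a single space, so
    'Digital<FONT ...>:</FONT>Media...' becomes 'Digital Media ...'."""
    return FILLER.sub(" ", html)

for variant in [
    "Digital<FONT color=#fefefe>:</FONT>Media<FONT color=#fefefe>_</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:",
    "Digital<FONT color=#fefefe>;</FONT>Media<FONT color=#fefefe>+</FONT>Broadcast<FONT color=#fefefe>-</FONT>Tuner:",
]:
    print(normalize(variant))  # -> Digital Media Broadcast Tuner:
```

After normalizing, a single search for "Digital Media Broadcast Tuner:" covers every variant.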