UBot Underground

jewcat

Fellow UBotter
  • Content Count

    105
Posts posted by jewcat

  1. Alright, I'm scraping a site for specs on different electronics. Each item on the site has a large number of values to scrape and move into a CSV. Some pages have certain specs while others don't, so I am using an if command to search the page for each value's heading before setting the variable and adding it to the CSV string.

     

    Then I encountered a problem. From page to page the headings for each value are different:

     

    Digital<FONT color=#fefefe>:</FONT>Media<FONT color=#fefefe>_</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:

    Digital<FONT color=#fefefe>;</FONT>Media<FONT color=#fefefe>+</FONT>Broadcast<FONT color=#fefefe>-</FONT>Tuner:

    Digital<FONT color=#fefefe>_</FONT>Media<FONT color=#fefefe>;</FONT>Broadcast<FONT color=#fefefe>;</FONT>Tuner:

    Digital<FONT color=#fefefe>+</FONT>Media<FONT color=#fefefe>-</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:

     

    They are hiding different characters in between the words, colored to match the background of the table cell. So simply doing a page search for "Digital Media Broadcast Tuner", for example, will not work on every page because of this protection. Now I could go through and create an if command for every variant of every value, but let's face it, there are A LOT of values to scrape, and creating 4-16 if commands for each one is going to take ages. Is there no way I can use a wildcard or something to ease the pain here?

     

    Any ideas would be appreciated.
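
One possible way around the hidden characters, sketched here in Python rather than in UBot itself: strip every filler span out of the page source first, then search for the plain heading once. The `FILLER` pattern and the helper names `normalize_heading` and `has_heading` are hypothetical, and the pattern assumes the filler always looks like the four samples above (a single `<FONT ...>` tag wrapped around the junk character).

```python
import re

# Assumption: each hidden filler is one <FONT ...>...</FONT> span,
# as in the four sample headings quoted in the post.
FILLER = re.compile(r"<FONT\b[^>]*>.*?</FONT>", re.IGNORECASE | re.DOTALL)


def normalize_heading(raw_html: str) -> str:
    """Replace each hidden FONT span with a space and collapse whitespace."""
    text = FILLER.sub(" ", raw_html)
    return re.sub(r"\s+", " ", text).strip()


def has_heading(page_html: str, heading: str) -> bool:
    """Check for a plain-text heading after the filler has been removed."""
    return heading in normalize_heading(page_html)


samples = [
    "Digital<FONT color=#fefefe>:</FONT>Media<FONT color=#fefefe>_</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:",
    "Digital<FONT color=#fefefe>;</FONT>Media<FONT color=#fefefe>+</FONT>Broadcast<FONT color=#fefefe>-</FONT>Tuner:",
    "Digital<FONT color=#fefefe>_</FONT>Media<FONT color=#fefefe>;</FONT>Broadcast<FONT color=#fefefe>;</FONT>Tuner:",
    "Digital<FONT color=#fefefe>+</FONT>Media<FONT color=#fefefe>-</FONT>Broadcast<FONT color=#fefefe>_</FONT>Tuner:",
]

for sample in samples:
    print(normalize_heading(sample))
```

With this approach each value needs a single check such as `has_heading(page, "Digital Media Broadcast Tuner")`, however the filler characters vary from page to page.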

  2. I agree that the UI could be far more flexible, and one of the areas where UBot is lacking most is in options for compiled bots (e.g. I would love to be able to set the onload URL, so the bot opens to a webpage instead of a blank white page), but in the long run I'd say that UI flexibility is an improvement that can wait a good while yet. There are lots of long-needed improvements to the actual functionality of UBot still on the waiting list, and they are far more important than making compiled bots look prettier for the end user.

  3. Was going to continue this thread http://ubotstudio.com/forum/index.php?/topic/1943-seeking-help-with-squidoo-lens-making-bot/page__hl__squidoo__fromsearch__1 but it seems no one had anything to say and it doesn't sound like the OP ever made any progress.

     

    The issue I'm seeing is that when you click an edit button you get this flashy dHTML lightbox sort of popup window. I encountered the same thing a few times, and it never seems to let me select any elements within these popups.

     

    Is there a way to handle these? They're the same sort of popup boxes you see on Facebook and Google Buzz...

  4. I'm using a piece of PHP on my server. The bot inputs a directory name in one text input, and a list of scraped image URLs into another. The files are then downloaded directly to my server for working with. I had it hacked together by a coder on DP. The only issue is that it seems to drop about 20% of the images without fully downloading them.

     

    Still can't wait for an actual Save As feature to come along.

  5. Having a hell of a time trying to load the article text into a file on this particular article site. I have tried $page scrape and $scrape chosen attribute, tried loading it into a variable, and tried loading it into a list. I just can't seem to pull the data from here.

     

    Demo article url: http://www.articledashboard.com/Article/Celebrity-Gossip--Celeb-Relationships-/1415867

     

    I tried $page scraping between the <p></p>, which I thought would be the best option, but then I discovered it was scraping a different element elsewhere on the page. So I tried stepping up and scraping between the <td></td>, and it returns a blank value.

     

    Any suggestions?
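
A rough Python sketch of one way to avoid grabbing the wrong <p>: collect the text of every paragraph on the page and keep only the long ones, on the assumption that navigation and sidebar paragraphs are short while article body paragraphs are not. The class name `ParagraphCollector`, the helper `article_paragraphs`, the 40-character cutoff, and the sample markup are all hypothetical; the real page's structure has not been checked.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the visible text of every <p> element, in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buf = []

    def _flush(self):
        # Join the buffered pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buf).split())
        if text:
            self.paragraphs.append(text)
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self._in_p:  # an unclosed <p> ends where the next one starts
                self._flush()
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._flush()

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def article_paragraphs(html, min_chars=40):
    """Return paragraphs of at least min_chars characters (assumed to be body text)."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    if parser._in_p:  # document ended inside a <p>
        parser._flush()
    return [p for p in parser.paragraphs if len(p) >= min_chars]


sample_html = (
    "<td><p>Home</p></td>"
    "<td><p>This is the first long paragraph of the article body text.</p>"
    "<p>Second paragraph with <em>inline</em> markup in it, also long.</p></td>"
)
print(article_paragraphs(sample_html))
```

The length cutoff is a crude heuristic; if the article sits inside a container with a recognizable id or class, filtering on that would likely be more reliable.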

  6. I have a funny feeling I am not going to like the answer I get from this thread, but is there any way to recover a corrupted bot source? I have been working on a project for someone and my computer just froze up. I saved the bot I was working on, rebooted the computer, and came back to UBot to get things completed. Every time I attempt to open the file, UBot crashes. I can open my other saved files just fine.

     

    Is this file lost forever? Nothing is popping up for the emergency backup feature, so I'm assuming the save went through, it just didn't save properly. :( Tell me there's some way to recover this work. Things were about 80% complete, and if I gotta start over this is a day down the drain...

  7. Maybe I'm just not seeing it, but I fail to see where the replacement is happening there at all.

     

    What I have is a list that I want to remove certain parts from. Another good example would be something like an HTML snippet that you want to remove all italic and bold font styles from.

     

     

    <p>

    The dog walked <strong>across</strong> the road

    </p>

    <p>The cat ran <em>away from</em> the dog.

    </p>

     

     

    So I'd have to replace <strong>, </strong>, <em>, & </em> all with a $nothing variable.

     

    Now I can easily take one of those strings of text out, but it's the next one where I have problems. In my case, I can get all the spaces out, but moving on from there to removing every ' and " and other invalid characters does not work.

     

    I am stripping down keyphrases for input into a form. Occasionally the phrases contain an invalid character, and I need to strip all of these invalid characters out.

     

    The snippet posted seems to remove entire lines from the list. I need to remove characters (or entire strings in the case of the HTML example) from each item.
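
For illustration, the behavior being asked for (several replacements applied to each item while every item stays in the list) could be sketched like this in Python. It is not UBot code, and `strip_from_items` plus the sample keyphrases are hypothetical.

```python
def strip_from_items(items, substrings=(), chars=""):
    """Return a new list with the given substrings and characters removed from each item."""
    # Translation table that deletes every character listed in `chars`.
    delete_table = str.maketrans("", "", chars)
    cleaned = []
    for item in items:
        # Apply every replacement in turn to the same item.
        for sub in substrings:
            item = item.replace(sub, "")
        cleaned.append(item.translate(delete_table))
    return cleaned


html_items = [
    "<p>The dog walked <strong>across</strong> the road</p>",
    "<p>The cat ran <em>away from</em> the dog.</p>",
]
tags = ("<strong>", "</strong>", "<em>", "</em>")
print(strip_from_items(html_items, substrings=tags))

keyphrases = ["dog's \"best\" toys", "cat food"]
print(strip_from_items(keyphrases, chars=" '\""))
```

The key point is that each pass works on the result of the previous one, so removing spaces first does not prevent the quotes from being removed afterwards, and no item is ever dropped from the list.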
