UBot Underground

Get rid of duplicate scraped variables in list



Hi Folks

 

I have come up against an odd scraping issue.

 

I am scraping the following HTML:

 

This question has been viewed <strong>988</strong> times; it has <strong>1</strong> monitor with <strong>11277</strong> topic followers and <strong>0 </strong><a href="/Does-a-vanity-URL-shortener-improve-SEO/alias">aliases</a> exist.

 

Here is the code:

 

if($exists(<innertext="Question Stats">)) {
    then {
        comment("Question viewed - times")
        set(#QuestionViewed, $page scrape("This question has been viewed ", " times; it has "), "Global")
        set(#QuestionViewed, $replace(#QuestionViewed, "<strong>", ""), "Global")
        set(#QuestionViewed, $replace(#QuestionViewed, "</strong>", ""), "Global")
        set(#QuestionViewed, $trim(#QuestionViewed), "Global")
        add list to list(%ListViewed, $list from text(#QuestionViewed, ""), "Don\'t Delete", "Global")
        comment("monitor amount")
        set(#QuestionMonitor, $page scrape(" times; it has ", " monitor with "), "Global")
        set(#QuestionMonitor, $replace(#QuestionMonitor, "<strong>", ""), "Global")
        set(#QuestionMonitor, $replace(#QuestionMonitor, "</strong>", ""), "Global")
        set(#QuestionMonitor, $trim(#QuestionMonitor), "Global")
        add list to list(%monitor, $list from text(#QuestionMonitor, ""), "Don\'t Delete", "Global")
        comment("Topic followers")
        set(#QuestionFollower, $page scrape(" monitor with ", " topic followers and"), "Global")
        set(#QuestionFollower, $replace(#QuestionFollower, "<strong>", ""), "Global")
        set(#QuestionFollower, $replace(#QuestionFollower, "</strong>", ""), "Global")
        set(#QuestionFollower, $trim(#QuestionFollower), "Global")
        add list to list(%topicFollowers, $list from text(#QuestionFollower, ""), "Don\'t Delete", "Global")
        comment("Following this question")
        set(#QuestionStatFollowingQuestion, $scrape attribute(<class="following_count">, "innertext"), "Global")
        set(#QuestionStatFollowingQuestion, $replace(#QuestionStatFollowingQuestion, " people are following this question.", ""), "Global")
        set(#QuestionStatFollowingQuestion, $replace(#QuestionStatFollowingQuestion, " person is following this question.", ""), "Global")
        set(#QuestionStatFollowingQuestion, $trim(#QuestionStatFollowingQuestion), "Global")
        add list to list(%QuestionFollowers, $list from text(#QuestionStatFollowingQuestion, ""), "Don\'t Delete", "Global")
    }
}

 

When I watch in the debugger, the number as originally scraped is wrapped in <strong> </strong> tags. I strip away the tags and trim the number, but I end up with the number repeated in the variable.

 

Example of variable output (it should be a single number, not repeated):

974

974

 

See the attached image of the debugger.

 

I am building up a data set and we need to allow duplicates in the column but not in the cell.

 

How do I validate this to ensure only one number is in the variable before I add it to the list?

Is this a bug or have I done something wrong?
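
One possible check, as a rough sketch only: it assumes the repeated value arrives as two matches separated by a new line (for example, because the scraped phrase occurs twice in the page source), which has not been confirmed here. %ViewedMatches is a hypothetical helper list that is not part of the script above. The idea is to split the cleaned variable on new lines and keep only the first item, in place of the existing add list to list line for %ListViewed.

```
comment("Assumption: the two copies are separate matches joined by a new line")
clear list(%ViewedMatches)
add list to list(%ViewedMatches, $list from text(#QuestionViewed, $new line), "Don\'t Delete", "Global")
comment("Keep only the first match; list positions start at 0")
set(#QuestionViewed, $trim($list item(%ViewedMatches, 0)), "Global")
comment("Add the single value, still allowing duplicates in the column")
add item to list(%ListViewed, #QuestionViewed, "Don\'t Delete", "Global")
```

If that assumption holds, the same pattern could be repeated for #QuestionMonitor, #QuestionFollower and #QuestionStatFollowingQuestion.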

 

Thanks for any suggestions
