Saturday, June 9, 2012

I'm following you on Twitter...are you following me back?

If you spend some time on Twitter, you might have some followers and some people that you follow...the more time you spend, the more people you're going to interact with...

Sometimes, you just realize that you're following so many people who might or might not follow you back...for some "accounts", it doesn't matter...I mean...if I follow @annafaris I don't expect her to follow me back...would love that of course, but I have some common sense -:) But...when it's a John Doe that I follow...and he doesn't follow me back...things get personal...and it's time to clean up Twitter a little bit...

Twitter provides some useful APIs that are sadly restricted to only 150 calls per hour, as you can verify by calling Rate_Limit_Status.
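
To get a feeling for what that limit means, here is a rough back-of-the-envelope sketch. It assumes the 100-accounts-per-call limit of the Lookup API mentioned further down, plus one call for the follower IDs and one for the friend IDs; calls_needed and max_ids_per_hour are names I made up for the illustration, not part of any API.

```r
# Figures taken from this post: 150 calls per hour, 100 IDs per lookup call
calls_per_hour <- 150
ids_per_lookup <- 100

# Hypothetical helper: calls needed to put names on `n_ids` accounts,
# assuming one call for the follower IDs and one for the friend IDs
calls_needed <- function(n_ids) {
  2 + ceiling(n_ids / ids_per_lookup)
}

# Largest number of accounts that fits in one hour under those assumptions
max_ids_per_hour <- (calls_per_hour - 2) * ids_per_lookup

print(calls_needed(1250))   # 2 + 13 = 15 calls
print(max_ids_per_hour)     # 14800
```

So under those assumptions the hourly limit should only start to hurt with a very long list of accounts to look up.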

Anyway...I was thinking about doing something with Twitter, and especially with the people that I follow who don't follow me back...so of course...I chose #R, as I have already done some interesting things with Python...

setwd("C:/Debug/R Source Codes")

Get_Twitter_Info<-function(p_source){
  # Read the XML answer line by line
  web_page<-readLines(p_source)
  # Keep only the lines holding an <id> element
  mypattern = '<id>([^<]*)</id>'
  datalines = grep(mypattern,web_page,value=TRUE)
  # Cut every match out of its line
  getexpr = function(s,g)substring(s,g,g+attr(g,'match.length')-1)
  g_list = gregexpr(mypattern,datalines)
  matches = mapply(getexpr,datalines,g_list)
  # Drop the tags and keep the captured value
  result = gsub(mypattern,'\\1',matches)
  names(result) = NULL
  return(result)
}

Get_Screen_Name<-function(p_userid){
  # paste0() joins the pieces without the blank spaces that paste() adds
  user_url<-paste0("https://api.twitter.com/1/users/lookup.xml?user_id=",
                   p_userid,"&include_entities=false")
  web_page<-readLines(user_url)
  # Same extraction as above, this time for the <screen_name> elements
  mypattern = '<screen_name>([^<]*)</screen_name>'
  datalines = grep(mypattern,web_page,value=TRUE)
  getexpr = function(s,g)substring(s,g,g+attr(g,'match.length')-1)
  g_list = gregexpr(mypattern,datalines)
  matches = mapply(getexpr,datalines,g_list)
  screen_name = gsub(mypattern,'\\1',matches)
  names(screen_name) = NULL
  return(screen_name)
}

trim <- function(x){
  x<-gsub(' ','',x)
  return(x)
} 

# Build each URL in pieces so no line break or blank ends up inside it
followers<-Get_Twitter_Info(paste0("https://api.twitter.com/1/followers/ids.xml?",
                                   "cursor=-1&screen_name=Blag"))
following<-Get_Twitter_Info(paste0("https://api.twitter.com/1/friends/ids.xml?",
                                   "cursor=-1&screen_name=Blag"))

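People_Id<-""
Bad_People<-c()
Bad_Names<-c()
j<-0

for(i in seq_along(following)) {
  # Skip the people who already follow me back
  if(following[i] %in% followers) next
  People_Id<-paste(People_Id,trim(following[i]),sep=",")
  j<-j+1
  # The Lookup API takes at most 100 IDs per call, so look up every 100
  if(j>=100){
    # substring(...,2) drops the leading comma
    Bad_People<-Get_Screen_Name(substring(People_Id,2))
    Bad_Names<-append(Bad_Names,Bad_People)
    People_Id<-""
    j<-0
  }
}

# Look up whatever is left in the last, partial group
if(j>0){
  Bad_People<-Get_Screen_Name(substring(People_Id,2))
  Bad_Names<-append(Bad_Names,Bad_People)
}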

write.csv(Bad_Names,"Bad_Names.csv",row.names=FALSE)

This little program takes my followers (from my account @Blag) and the people I follow...a simple loop over the people I'm following lets me determine who is following me back and who isn't. With that identified, I build groups of 100 user IDs (as the Lookup API only supports 100 accounts per call) and grab their user names...
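
The same idea can also be sketched without the explicit loop. This is just a sketch with made-up IDs, not what the program above does: setdiff() picks the IDs I follow that are missing from my followers, and split() cuts them into groups of at most 100. The names following_ids, followers_ids and chunk_ids are mine, purely for illustration.

```r
# Illustrative IDs only; the real ones would come from the API calls
following_ids <- as.character(1:250)
followers_ids <- as.character(seq(2, 250, by = 2))

# IDs I follow that do not show up among my followers
not_following_back <- setdiff(following_ids, followers_ids)

# Hypothetical helper: cut a vector of IDs into groups of at most `size`
chunk_ids <- function(ids, size = 100) {
  if (length(ids) == 0) return(list())
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

groups <- chunk_ids(not_following_back)

# One comma-separated string per group, ready to go into a lookup URL
id_strings <- vapply(groups, paste, character(1), collapse = ",")

print(lengths(groups))   # one group of 100 IDs and one of 25
```

Each element of id_strings could then be handed to something like Get_Screen_Name, one call per group, so the last partial group gets looked up like any other.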

Finally, I generate a .CSV file with all the people who I follow but who don't follow me back...time to clean up my Twitter -;)

P.S.: I would love to show the generated file...but...I don't want to expose the names of the Bad People, whose only but unforgivable crime is not following me back -:)

Greetings,

Blag.
