The concept is the same...we create a PhantomJS script that reads a user's Twitter page and collects the hashtags from the first 5 pages of tweets (the script scrolls down five times to load them)...here's the PhantomJS script...
| Hashtags.js |
|---|
var system = require('system');
var webpage = require('webpage').create();
// Use a desktop-sized viewport and start at the top of the page.
webpage.viewportSize = { width: 1280, height: 800 };
webpage.scrollPosition = { top: 0, left: 0 };
// The Twitter user name is the first command-line argument.
var userid = system.args[1];
var profileUrl = "http://www.twitter.com/" + userid;
webpage.open(profileUrl, function(status) {
  if (status === 'fail') {
    console.error('webpage did not open successfully');
    phantom.exit(1);
    return; // phantom.exit() does not stop the callback by itself
  }
  var i = 0,
      top,
      queryFn = function() {
        return document.body.scrollHeight;
      };
  // Every 3 seconds, scroll to the bottom so Twitter loads more tweets.
  setInterval(function() {
    top = webpage.evaluate(queryFn);
    i++;
    webpage.scrollPosition = { top: top + 1, left: 0 };
    // After 5 scrolls, collect the text of every hashtag link and print it.
    if (i >= 5) {
      var twitter = webpage.evaluate(function () {
        var twitter = [];
        var forEach = Array.prototype.forEach;
        var tweets = document.querySelectorAll('[data-query-source="hashtag_click"]');
        forEach.call(tweets, function(el) {
          twitter.push(el.innerText);
        });
        return twitter;
      });
      twitter.forEach(function(t) {
        console.log(t);
      });
      phantom.exit();
    }
  }, 3000);
});
|
Now...what I want to do with this information...is to send it to Haskell...and get the most used hashtags...so I will count how many times each one appears and then get rid of the ones that appear fewer than 5 times...
Let's see the Haskell code...
| hashtags.hs |
|---|
import System.Process
import Data.List
-- Run the PhantomJS script for the given user and print the most used hashtags.
hashTags :: String -> IO ()
hashTags user = do
  output <- readProcess "phantomjs" ["--ssl-protocol=any", "Hashtags.js", user] []
  mapM_ print $ sortBy sortGT $ count output
-- Count each hashtag and keep only those that appear at least 5 times.
count :: String -> [(String, Int)]
count xs = filter ((>= 5) . snd) $
           map (\ws -> (head ws, length ws)) $
           group $ sort $ words xs
-- Order by count, highest first; break ties alphabetically.
sortGT :: (Ord a, Ord a1) => (a1, a) -> (a1, a) -> Ordering
sortGT (a1, b1) (a2, b2)
  | b1 < b2 = GT
  | b1 > b2 = LT
  | otherwise = compare a1 a2
|
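If you want to see what the counting and sorting do without calling PhantomJS at all...here's a small sketch that runs the same pipeline on made-up input. Everything named here is an assumption for illustration only: `countAtLeast` takes the threshold as a parameter (the code above hard-codes 5), `rank` uses `comparing` with `Down` in place of `sortGT`, and `sample` is invented data standing in for the script's output...

```haskell
import Data.List (group, sort, sortBy)
import Data.Ord (Down (..), comparing)

-- Count each whitespace-separated token and keep those seen at least
-- `threshold` times (hypothetical parameter; the post uses a fixed 5).
countAtLeast :: Int -> String -> [(String, Int)]
countAtLeast threshold =
  filter ((>= threshold) . snd)
    . map (\ws -> (head ws, length ws))
    . group
    . sort
    . words

-- Most frequent first; ties broken alphabetically, the same order sortGT gives.
rank :: [(String, Int)] -> [(String, Int)]
rank = sortBy (comparing (Down . snd) <> comparing fst)

-- Made-up sample standing in for the PhantomJS output (one hashtag per line).
sample :: String
sample = unlines ["#SAP", "#Haskell", "#SAP", "#HANA", "#Haskell", "#SAP"]

main :: IO ()
main = mapM_ print (rank (countAtLeast 2 sample))
```

With a threshold of 2, `#HANA` (seen once) is dropped and `#SAP` comes out ahead of `#Haskell`...with the real output you would simply pass 5 instead.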
The nice thing about this app is that we can pass any username as a parameter and the result is going to be nicely ordered and filtered...another reason to love Haskell ;-)
Greetings,
Blag.
Development Culture.

