Tuesday, September 23, 2014

Web scraping with Haskell and PhantomJS

Some time ago I wrote a blog post called Web scrapping with Julia and PhantomJS...today...I wanted to do the same but using Haskell instead...

The concept is the same...we create a PhantomJS script that will read a "user" Twitter page and get the hashtags of the first 5 pages...here's the PhantomJS script...

Hashtags.js
var system = require('system');

var webpage = require('webpage').create();
webpage.viewportSize = { width: 1280, height: 800 };
webpage.scrollPosition = { top: 0, left: 0 };

var userid = system.args[1];
var profileUrl = "http://www.twitter.com/" + userid;

webpage.open(profileUrl, function(status) {
 if (status === 'fail') {
  console.error('webpage did not open successfully');
  phantom.exit(1);
  return; // stop here so the scrolling timer is never started
 }
 var i = 0,
 top,
 queryFn = function() {
  return document.body.scrollHeight;
 };
 setInterval(function() {
  top = webpage.evaluate(queryFn);
  i++;
   
  webpage.scrollPosition = { top: top + 1, left: 0 };

  if (i >= 5) {
   var twitter = webpage.evaluate(function () {
    var twitter = [];
    var forEach = Array.prototype.forEach;
    var tweets = document.querySelectorAll('[data-query-source="hashtag_click"]');
    forEach.call(tweets, function(el) {
     twitter.push(el.innerText);
    });
    return twitter;
   });

   twitter.forEach(function(t) {
    console.log(t);
   });

   phantom.exit();
  }
 }, 3000);
});

If we run the script we're going to see the following output...


Now...what I want to do with this information...is to send it to Haskell...and get the most used hashtags...so I will count them and then get rid of the ones that appear fewer than 5 times...

Let's see the Haskell code...

hashtags.hs
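import System.Process (readProcess)
import Data.List (group, sort, sortBy)

-- Run the PhantomJS script for a user and print the most used hashtags.
hashTags :: String -> IO ()
hashTags user = do
  output <- readProcess "phantomjs" ["--ssl-protocol=any", "Hashtags.js", user] []
  mapM_ print $ sortBy sortGT $ count output

-- Count each hashtag and keep the ones that appear at least 5 times.
count :: String -> [(String, Int)]
count xs = filter ((>= 5) . snd) $
  map (\ws -> (head ws, length ws)) $
  group $ sort $ words xs

-- Order by count, highest first; ties are sorted alphabetically.
sortGT :: (Ord a, Ord a1) => (a1, a) -> (a1, a) -> Ordering
sortGT (a1, b1) (a2, b2)
  | b1 < b2 = GT
  | b1 > b2 = LT
  | otherwise = compare a1 a2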

When we run this code...we're going to have this output...


The nice thing about this app is that we can pass any username as a parameter and the result is going to be nicely ordered and filtered...another reason to love Haskell -;)
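Since the counting and ordering steps are pure functions, they can be tried out without calling PhantomJS at all. Here is a minimal sketch of that idea, assuming a made-up sample of hashtags instead of real Twitter output. The names countTags, topTags and sample are only for illustration, and it swaps in Data.Map and Down for the group/sort approach used in hashtags.hs, so take it as one possible variant rather than the way to do it...

```haskell
import qualified Data.Map.Strict as Map
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- Count how many times each word (hashtag) appears in the input.
countTags :: String -> [(String, Int)]
countTags = Map.toList . Map.fromListWith (+) . map (\w -> (w, 1)) . words

-- Keep the tags that appear at least `minCount` times,
-- ordered by count (highest first) and then alphabetically.
topTags :: Int -> String -> [(String, Int)]
topTags minCount =
  sortBy (comparing (\(tag, n) -> (Down n, tag)))
    . filter ((>= minCount) . snd)
    . countTags

-- Hypothetical sample input (not real data), one hashtag per line,
-- the same shape the PhantomJS script prints.
sample :: String
sample = unlines ["#haskell", "#sap", "#haskell", "#julia", "#sap", "#haskell"]
```

In GHCi, topTags 2 sample gives [("#haskell",3),("#sap",2)]...the threshold is a parameter here, so the "at least 5 times" rule is just topTags 5.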

Greetings,

Blag.
Development Culture.

Wednesday, September 17, 2014

Learn You a Haskell for Great Good! - Book review

Finally! I have finished reading Learn You a Haskell for Great Good! and here's my review -;)

After learning Erlang, I decided it was a good idea to get hardcore and learn an even more "Pure" functional programming language...Haskell was of course my best shot...

This book is just pretty awesome...it has 404 pages...so it's a big book...

The examples are really nice and easy to follow...of course...once you get your head around Haskell and its weird way of doing things...after a few headaches and curses...you find yourself loving the language...

This book will teach you about recursion, pattern matching, guards, monoids, and a handful of things that you couldn't find in an imperative language...
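To give a taste of those topics, here is a tiny sketch of my own (not taken from the book) that touches recursion, pattern matching, guards and monoids. The function names sumList, classify and total are made up just for this illustration...

```haskell
import Data.Monoid (Sum (..))

-- Recursion and pattern matching: add up a list by splitting it
-- into its head and its tail.
sumList :: [Int] -> Int
sumList [] = 0
sumList (x:xs) = x + sumList xs

-- Guards: pick a branch based on boolean conditions.
classify :: Int -> String
classify n
  | n < 0 = "negative"
  | n == 0 = "zero"
  | otherwise = "positive"

-- Monoids: wrap each number in Sum and let foldMap combine them.
total :: [Int] -> Int
total = getSum . foldMap Sum
```

Both sumList [1,2,3] and total [1,2,3] give 6...the first one spells out the recursion by hand, the second one leans on the Sum monoid to do the combining.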



If you want to learn Haskell...this book is definitely for you...written with lots of humor and great examples it makes learning Haskell a real joy...I would recommend it 100%...

If you have never tried a functional language...I recommend you do so...you will not regret it...you will become a smarter and better developer...sure...you will get frustrated a thousand times...but there's a price to pay when it comes to becoming better at something...

Haskell will help you to become better and this book will guide you through the hard path...

Greetings,

Blag.
Development Culture.

Tuesday, September 2, 2014

Decimal to Romans - Haskell Style

As promised...here's my Haskell take on Decimal to Romans...I've got to say...it took me considerably less time to build it in Haskell than it took me to build it in Erlang...but for sure...I had already done it in Erlang...so I had some sort of advantage -;)

Anyway...this was so much fun to do...and I couldn't be happier with the overall process...Haskell being so pure is really a joy to work with...

Too much talk...here's the source code...

Roman_Numerals.hs
-- Print the Roman numeral for a decimal number.
showRomans :: Int -> IO ()
showRomans num = putStr $ concat $ get_roman num 0

-- Walk the keys from largest to smallest; ctr is the index of the current key.
get_roman :: Int -> Int -> [String]
get_roman num ctr
  | num <= 0 = ["\n"]
  | num >= roman = make_roman roman ++ get_roman (num - roman) ctr
  | otherwise = get_roman num (ctr + 1)
  where roman = roman_keys !! ctr

-- Roman numeral for each supported key.
make_roman :: Int -> [String]
make_roman 1 = ["I"]
make_roman 4 = ["IV"]
make_roman 5 = ["V"]
make_roman 9 = ["IX"]
make_roman 10 = ["X"]
make_roman 40 = ["XL"]
make_roman 50 = ["L"]
make_roman 90 = ["XC"]
make_roman 100 = ["C"]
make_roman 400 = ["CD"]
make_roman 500 = ["D"]
make_roman 900 = ["CM"]
make_roman 1000 = ["M"]

-- Keys in descending order, so the largest one that fits is tried first.
roman_keys :: [Int]
roman_keys = [1000,900,500,400,100,90,50,40,10,9,5,4,1]
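The same greedy idea can also be sketched with a single lookup table that pairs every key with its numeral, so there is no index counter to carry around. This is only an alternative sketch...the names romanTable and toRoman are made up for it, and unlike showRomans it returns the String instead of printing it...

```haskell
-- Each decimal key is paired with its Roman numeral, largest first.
romanTable :: [(Int, String)]
romanTable =
  [ (1000, "M"), (900, "CM"), (500, "D"), (400, "CD")
  , (100, "C"), (90, "XC"), (50, "L"), (40, "XL")
  , (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I") ]

-- Greedy conversion: take the largest key that still fits, emit its
-- numeral, and continue with what is left.
-- Numbers <= 0 give an empty string.
toRoman :: Int -> String
toRoman n = go n romanTable
  where
    go _ [] = ""
    go m table@((value, numeral) : rest)
      | m <= 0 = ""
      | m >= value = numeral ++ go (m - value) table
      | otherwise = go m rest
```

For example, toRoman 2014 gives "MMXIV" and toRoman 1994 gives "MCMXCIV"...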

As usual...here's the screenshot...


After so many blogs in just a few days...I think I deserve a break, so I can keep reading the Haskell book...as I promised a review of it...

Greetings,

Blag.
Development Culture.