[Computing] F# revisited

I was interested in F# because of some of its fun usages, and the ease to do asynchronous and parallel computing. Used F# to implement some domain specific language like Turtle and fractal effect this afternoon. Basically the tasks themselves are nearly the same as this post and this post, some own thoughts were inspired.

Let's first see some examples. For the turtle system, below is a command sequence (actually a linked list) for drawing a triangle:


let koch1 = [ Fwd 243.; Right 120.; Fwd 243.; Right 120.;

Fwd 243. ]

 

It's easy to understand, first draw a line with length 243 pixels along the current direction, and then turn right 120 degrees, draw another line with length of 243 pixels and turn 120 degrees again, following by the final edge with length 243 pixels.

 

What I want to emphasize is, this is a simple language, a so-called domain-specific language (DSL). It's extremely easy to implement some embedded DSLs inside F# due to its flexible syntax.

 

Let's see some amazing  sample usage of this tiny language with only 3 keywords and 1 data type. The following function takes a command sequence as input, and "expand" it as a longer one. More specifically, this function is used to expand a straight line into a certain line segment as follows:

Code:

let kochChange drawing acc = [

    for cmd in drawing do

        match cmd with

        | (Fwd, n) -> 

            let x = n / 3.0

            yield Fwd x

            yield Left 60.

            yield Fwd x

            yield Right 120.

            yield Fwd x

            yield Left 60.

            yield Fwd x

        | c -> yield c ]

Note the parameter acc is a dummy one, making it suit the interface of List.fold below. Then We can simply draw some amazing fractal shape easily:

 

[1..4] |> List.fold kochChange koch1

 

Note koch1 is defined before to draw a triangle, therefore this statement expands a triangle with the rule shown above recursively, and produce a command list for the following figure:

 

The complete code is available in my github. The famous Mandelbrot set is also easy to draw with this code, with asynchronous and parallel computing enabled.

 

After this exercise, I feel more comfortable to use F# and find it appealing in the following aspects:

  • Lightweight yet rich type definition. New types or structures can be defined neatly and even within one line. Examples are like

 

type Command = Fwd | Left | Right

type MandelbrotResult = | DidNotEscape | Escaped of int

 

This makes the extensive usage of object-oriented style very easy. For example, use a new enumeration as place holders in DSLs, like this post illustrates.

  • Compact yet flexible syntax. The unique syntax like pattern matching, treating tuple as first class component makes it easy to write intuitive logic within one to two lines, while other languages may need 10 lines. For example, the following code computes the Mandelbrot iteration result of a given complex number, and return the iteration number if it's convergent, otherwise output 0:

 

match mandelbrot (Complex(x, y)) with | DidNotEscape -> (x, y, 0) | Escaped i -> (x, y, i)


  • Convenient collection interfaces. F# not only supports traditional functional ones like map, filter, fold, and also lazy sequence generation. The keywords like yield makes long LINQ statements in C# like nested SelectMany and Select pretty easy and intuitive to write. An example is the kochChange function above.

  • Easy asynchronous and parallel programming, such as non-blocked operations, parallel operations, etc. This is clearly illustrated in Don's post.

 

Overall, F# is a pretty interesting language, lightweight, powerful, and neat. And finally I feel sort of mastering it... Will further explore its potential for scientific computing and possibly updated again soon. 


Tags: , , , ,
Categories: Computing

0 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

[Alive] Lumia 900

 

Got a Lumia 900 after using iPhone for quite a long time. It's very different from iPhone. Some are good, while some are bad. 

  • Smooth OS experience: This is a big plus. The system is very responsive, and the back key is pretty handy. 

  • Input method: The keyboard amazes me! Both English and Chinese input methods are excellent. Auto correction makes the input experience just like on an iPad, it can almost guess every single word you input correctly. And the Chinese input method is even better than sogou. This makes the phone very suitable for business. 

  • Tile interface: Compared to iPhone and android's interfaces, personally I like this design! Actually most apps will not be often used. "Pin to start" balanced the "popular" apps and the "regular" ones.  

  • UI: Some designs are good, like there is a "share on facebook" in the menu of photos. But there are no SBSettings-like toggles for WiFi, Flight mode, etc. And some apps will just make the CPU usage 100%, and there is no way to terminate processes by users, thus I have to restart the phone. 

  • Apps: well, there are fewer apps in Windows platform currently, but you'll find most popular ones there. I have Office, Skype, Amazon Kindle, Dropbox, Facebook, SSH client, Yelp, etc., which are quite enough for business, especially considering Personal Information Management Suite like People and Calendar has been put in the phone. 

  • IE10: I have to say, this is worse than Safari. You have to tap quite a few times to switch tabs or enter favorites. And the Gmail interface is as naïve as non-smartphones. The speed is good though.  

In summary, the impression Windows Phone gives me is fast. It's super fast. But there are really some places Microsoft needs to polish more.

 


Tags: ,
Categories: Alive

0 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

[Computing] Thoughts about convex optimization

Just finished the final of convex optimization. This is really an interesting course, which even changed the way I'm looking at this world. In college, I was thinking in an algorithmic way, i.e., to solve a problem, will try to design some certain steps, following which we can get a result somehow satisfying our expectation. But now, I'd first spend quite a bit time on the expectation itself, expressing them in the way "what to minimize" and "subject to what constraints". Then simply feed them to convex optimization solvers. 

 

That is, convex optimization provides a general modeling language, with which it can be ensured to get an answer efficiently. It makes the modeling process intuitive and effective, and is the most important practical branch of mathematics in my humble opinion. 

 

I used image denoising as the example for quite a few algorithms in the blog [post1post2post3], and demonstrated how different algorithms help improve the performance. But what if we treat it as a optimization problem? We wish to have less effort to fix/change the image, and the fixed image tends to have smoother changes, i.e. less variations. This is a typical problem named total variation reconstruction, trying to minimize a weighted sum of some L1 norms, and it's convex! By writing 4 lines of MATLAB code, we get this result: 

 

 

The left one is the noisy image, while the right is the recovered one. Compared with the result from Loopy Belief Propagation, the improvement is obvious. So we can see that the optimization approach is intuitive to model, fast to implement, and easier to debug. The code written with CVX, a MATLAB convex optimization package, is put as follows for the record. We can see how elegant it is. 

 

cvx_begin 

variable x(h, w); 

minimize norm(x - img, 1) + norm(x(:, 2:w) - x(:, 1:w-1), 1) + norm(x(2:h, :) - x(1:h-1, :), 1); 

cvx_end 

 

As a heavy .NET user, I tried to find .NET version of optimization problem solvers (due to licensing issues of MATLAB). Microsoft Solver Foundation looks a good candidate, also supporting a neat and expressive language named OML just like CVX does, and free for academic users. But I still haven't figured out how to use it. Gurobi looks simple and powerful too, although haven't tried it. Anyway, CVX is certainly a good start point. 

 

If you get interested in convex optimization, Steven Boyd's course Convex Optimization is definitely a good (possibly the best) resource, with the textbook also from Steven here. If don't feel like much theory, skimming the exercises in the book and the additional exercises about your area may be enough.


Tags: , , ,
Categories: Computing

0 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

[Alive] A new start...

Long time no updates here... Need to have some fresh new start. Anyway, blog is a good place to organize the thoughts and keep good habits.

 

Record the life

As getting busier, it's also harder for friends to get to know your life. As a PhD student, with less time for twittering, fbing, and calling with friends, blog is becoming one of the few ones surviving for friends to know me. That pushes me to restart the posts with "Alive" tags, which is for recording the life. This tag was closed for about 2 years because I just want to make the blog look professional, but it seems the demo website and the academic homepage fit better.

So, will try to make this a frequently-updating place, and hopefully you'll see my life here, unlike the fragments in facebook.

 

Morning reading

Read some books recently, like "Brain bugs" and "The personal credibility". Quite inspiring books, although more suitable for quick digest rather than patient reading. Also inspired from Yun Zhou (link in Chinese), begin thinking about fixing some time for book reading, just like the freshman year getting up at 6:30am reading English. :) So, plan to begin tomorrow, getting up at 06:30, begin reading in front of Low Library from 06:45 to 07:30, and then share the thoughts in the blog/weibo!

 

Plan for the win

Resumed the jogging yesterday and begin using Mint today. Have a strong feeling that plan/budget is really important. When having a goal such as don't stop for 3.5km, I can clearly tell I'm doing good or bad (and finally make it). Making progress looks so difficult and even impossible without even a goal.

Time for the first series of the goals.

1) Do morning reading till leaving for intern.

2) Jog every two days till leaving for intern.

3) Do morning reading till intern ends.


In summary, feeling a new life growing with the semester ends. Let it begin. 


Tags:
Categories: Alive

2 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

[Computing] Automatic Recognition of EMS’s CAPTCHA

CAPTCHA is widely used to distinguish humans from robots, mainly preventing improper automations. However, sometimes we really need some automation which actually won’t throw much burden to the server. An example is to check the status of an EMS (Express Mail Service, an “official” courier service provider hosted by China Post) package every half an hour. This article will illustrate how to recognize the CAPTCHA with basic image processing and machine learning techniques.

The CAPTCHA of EMS can be fetched from http://www.11183.com.cn/emsService/ems/rand. It usually looks like rand2, with colorful digits and clutter background. Note there is no distortion on foreground digits, and the digits usually appear in relatively fixed positions. This makes our recognition a lot easier. The recognition process can be divided into three stages, image binarization, digit splitting, and digit recognition.

Image binarization means making the chromatic image into black and white. The usual approach is first turning the image into gray-scale, computing a gray-scale threshold, and then binarizing the image with the threshold. The first and third steps are fairly easy, while the threshold computation looks harder but has a mature algorithm named Otsu’s method proposed in 1979. MATLAB (with image processing toolbox) has a good implementation of the algorithm, and has a function named im2bw for the overall process of image binarization. The output binary image of the sample CAPTCHA is like ems_binarized. Perfect!

Now the problem gets a lot more promising. Splitting the digits gets trivial base on such binary images. But to accelerate the development progress, we measure the positions of each digits manually, and split digits by such fixed positions. This approach works, although with some errors (in one pixel or two). The splitting results are as following images show.

1  2  3  4  5  6

After looking into the split digits, some problems emerge. Take the digit 3 as an example like the following images. We can notice the sizes, relative positions (of the digit in the frame) are not exactly the same. And there are sometimes holes or black blobs from binarization. This makes digit recognition harder, i.e. exact matching will not work any more. There are two intuitive approaches to solve the problem. One is normalization, i.e. using image processing techniques to calibrate size and position of both training and testing images, and using erosion and dilation to eliminate holes/blobs before exact matching. And the other approach is to collect lots of training data, to cover most variances of each digit.

2_4  4_5  6_3  9_3  9_6  11_1  13_5  17_5  19_1

Which way is better really depends on the data. If there are lots of factors affecting the digits’ appearance, model-driven calibration is much more efficient, without needing to collect/generate enormous possible training data. But if the data are fairly simple and nearly the same, like the case here, collecting enough training data requires much less efforts, comparing to designing, implementing the calibration model and debugging. Here we only collect 20 sample CAPTCHAs with one line of PowerShell script, which is proved to work perfectly in practice.

To deal with the variance of the data while maintaining a least effort in implementing the model, we adopt a K Nearest Neighbor approach. That is, we compare the testing image with every image in the training set, choose the “most similar” one in the sense of Hamming Distance, and adopt the label of that “most similar” image as the recognition result. With like 30 lines of C# code (thanks to LINQ, we can write compact codes comfortably), we got perfect recognition results on our testing set. For the sample CAPTCHA, the program provides 761803, which is exactly the correct answer.

Digging more deeply, this model is actually very discriminative. Following figure shows the distance of each digit candidate to an image of “3”. It’s easy to see the least distances of other digits are obviously larger than that of 3, which is actually 0.

ems_discriminativity

There are two inspirations from this small “project”. One is, “large-scale” data can really replace sophisticated algorithms (like this article in Wired says. I add quotes because the scale of data only needs to be large enough, which is not always absolutely large). In our case, that is we choose the effort to collect more training data against design and implement a complex image calibration model. (Of course, data is not always a better solution as we analyzed before) Other than data issues, another inspiration is integrating various programming techniques, making them working for most suitable situations, can obviously improve the development efficiency, as illustrated in this post before. Rather than arguing about which technique is “better”, it’s more important to figure out suitable application scenarios of different techniques and apply them.


Tags: , , ,
Categories: Computing

11 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed