Archive for the 'science technology' Category

Jan 02 2010

JavaCC Parser for WordNet 3.0 Noun Data Set

Published by Forager under science technology

Here is a bare-bones parser (parse_wn.jj) that reads the WordNet 3.0 noun data set (dict/data.noun).

The parser is not perfect: run against the original data.noun file as unpacked from WordNet, it fails at the entry “zero”. It also requires that no word in the dictionary start with “0” (which just happens to be the case). However, this is as close as I could get after two days’ work. “Zero” is the only entry that fails, and it parses if you put double quotes around the digit 0 in the synset ring.
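To make the “zero” problem concrete, here is a minimal sketch in Python (not the JavaCC grammar itself) that reads one data.noun record by field position instead of by token type. It assumes the record layout described in the WordNet database documentation: offset, lexicographer file number, part of speech, a hexadecimal word count, word/lex_id pairs, a decimal pointer count, four-field pointers, then the gloss after a “|”. The function name and the sample line are made up for illustration; the sample is not copied from data.noun. Because each word is taken by position, a word that is literally “0” is presumably no harder to read than any other.

```python
def parse_noun_line(line):
    """Parse one data.noun record by field position (illustrative sketch)."""
    data, _, gloss = line.partition("|")
    fields = data.split()

    offset, lex_filenum, ss_type = fields[0], fields[1], fields[2]
    w_cnt = int(fields[3], 16)  # word count is written in hexadecimal

    pos = 4
    words = []
    for _ in range(w_cnt):
        # Each word is followed by a one-digit hexadecimal lex_id.
        words.append((fields[pos], int(fields[pos + 1], 16)))
        pos += 2

    p_cnt = int(fields[pos])  # pointer count is written in decimal
    pos += 1
    pointers = []
    for _ in range(p_cnt):
        symbol, target_offset, target_pos, source_target = fields[pos:pos + 4]
        pointers.append((symbol, target_offset, target_pos, source_target))
        pos += 4

    return {
        "offset": offset,
        "lex_filenum": lex_filenum,
        "ss_type": ss_type,
        "words": words,
        "pointers": pointers,
        "gloss": gloss.strip(),
    }


# Hypothetical record in the data.noun layout; note the word "0".
SAMPLE = "00000001 23 n 02 zero 0 0 0 001 @ 00000002 n 0000 | an illustrative gloss"
record = parse_noun_line(SAMPLE)
print(record["words"])
```

A positional reader like this sidesteps the ambiguity between a numeric token and a word made of digits, at the price of giving up the grammar-driven error checking that a generated parser provides.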

The last time I wrote a serious parser was 15 years ago. I have always enjoyed writing parsers. It is like writing a poem, in a very strange way: you have to choose your words carefully. But if you are successful, you can express a lot of things with very few words.

This time around, the effort started quickly but got stuck in the sand soon after. More than once, I felt like a lab rat running around a maze: I could smell the cheese but still found a thin plastic wall in between. So this is the best I can get.

Right now, I am working on a taxonomy-related project, one that really kills brain cells. But it is exciting as hell.

Mar 02 2009

EC2 Is Best Suited for Overflow Traffic

Published by Forager under science technology

Recently, I played with Amazon.com’s EC2 (Elastic Compute Cloud). Here is what I have learned:

1. EC2 is NOT Grid Computing. Many people are confused by the term “Cloud”, and so was I. Bezos said it probably should be called “infrastructure cloud”.

2. Cost-wise, EC2 is comparable to dedicated web hosting (EC2 cost calculator vs. my ISP’s dedicated server page), but it has a much stronger value proposition. It is particularly useful for handling temporary overflow traffic. In other words, if you are adding a new country or a new product line to your website, you may need to add a whole new server permanently. However, if you are running a promotion, you would rather rent a ready-configured server for a day or two. That is where you need EC2 (where every machine is essentially a software copy).

3. MSFT has a competing product, Azure. It is very tightly integrated with their Visual Studio IDE. As someone put it so well, Azure empowers developers whereas EC2 empowers operations. But how many companies that need cloud computing would entrust operations to developers?

4. EC2, despite its strong value proposition, is not the best choice cost-wise for starters (e.g. small startups, mom-and-pop shops, or small community sites). If a dedicated server is “elastic” enough for your site’s peak traffic, you may not need EC2. Only if you need more than a dedicated server should you seriously consider EC2.

5. EC2 is pretty easy to use. Still, the virtual machines do not persist, so you have to use another Amazon product, EBS (block storage), for everything that changes. This is a small overhead in exchange for the easy scalability.
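The cost argument in points 2 and 4 comes down to simple arithmetic, sketched below in Python. All prices here are hypothetical placeholders, not actual EC2 or ISP rates, and the function name is made up for illustration: the sketch compares renting an on-demand machine by the hour for a short burst against paying for one month of an extra dedicated server.

```python
def cheaper_option(burst_days, hourly_rate, monthly_dedicated, instances=1):
    """Compare hourly rental for a burst against one month of a dedicated server."""
    on_demand_cost = burst_days * 24 * hourly_rate * instances
    choice = "on-demand" if on_demand_cost < monthly_dedicated else "dedicated"
    return choice, on_demand_cost


# Hypothetical prices, for illustration only.
HOURLY_RATE = 0.20          # assumed on-demand price per instance-hour
MONTHLY_DEDICATED = 100.0   # assumed monthly price of one dedicated server

# A two-day promotion: renting by the hour wins easily.
promo_choice, promo_cost = cheaper_option(2, HOURLY_RATE, MONTHLY_DEDICATED)

# A machine that runs all month: the dedicated server is cheaper.
steady_choice, steady_cost = cheaper_option(30, HOURLY_RATE, MONTHLY_DEDICATED)

print(promo_choice, round(promo_cost, 2))
print(steady_choice, round(steady_cost, 2))
```

With these assumed numbers the short promotion favors hourly rental and the always-on case favors the dedicated box, which is the pattern described above; real prices would shift the break-even point.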

Overall, EC2 has a very targeted audience. I imagine those who use it would love it wholeheartedly.

Oct 28 2007

How to Sync Motorola KRZR K1 with Windows

Published by Forager under science technology

I bought a Motorola KRZR K1 on eBay a year ago. The default network and timezone were set to Hong Kong. Ever since I got the phone, I have had endless trouble getting it synced with Outlook on Windows: the Motorola Phone Tool worked for a while, then after an update it stopped syncing (charging still worked).

Motorola’s website was not much help. Although the K1 has been on sale in the States, it is not listed on their U.S. website. Finally I tried the HK/CN website and found a firmware update here.

Once the firmware was updated, all was well.
