
Natural frequencies versus probabilities
In recent years, a body of research has indicated that most people handle natural frequencies better than probabilities when it comes to statistical reasoning. A probability is a number from zero to one, such as 0.6, and the same information can be encoded using natural frequencies as "6 out of 10". Prior to recent exposure by the media, I had not heard of the term "natural frequencies", and while I am not surprised by the conclusion, I find it interesting that the reported difference in correct statistical reasoning using natural frequencies versus probabilities is so stark.


Younger Imaginations on Public Transport

Trains are supposed to be fast. However, our MRT is not as fast as it could have been, because the trains stop at every station. So instead of travelling at the maximum speed of 80 km/h, it averages a speed of about 40 km/h or so.

When I was much younger, I had imagined a system where the entire island of Singapore could be covered by a network of travellators. At certain points, two travellators would be adjacent to each other, so people could switch travellators if they wanted to go in a different direction. Many years later, I saw the same kind of concept in an amusement park for visitors to get on and off a particular ride. The problem is, how does the travellator manage the transition from zero speed to a very high speed and vice versa in a safe manner?

A later incarnation of the imagination involved two tracks: a main track and a side track. On the main track, train cars run at a constant high speed and never stop. The station platforms are located on the side tracks. Passengers board a car at the platform, which then accelerates to meet a car on the main track. The car on the side track then gets coupled with the car on the main track, much like how spaceships dock with the International Space Station, facilitating the exchange of passengers between the two cars. Subsequently, the car on the side track decelerates and stops at the next station, while passengers going to further destinations continue on at full speed.

It is easy to see that the latter scheme is also full of problems. When I dreamed of these schemes, the MRT was in much better shape compared to today. And I assumed that engineering could only get better, and eventually many of these problems could be resolved. However, there remained a couple of intractable problems. Firstly, the cars on the main track and side track can only be coupled together for a limited amount of time between stations, which means that it is extremely unsafe if something gets stuck between the two cars for an extended amount of time. Secondly, if stations are placed very close together, then there may be very little or no time for the cars to be coupled together.

Even later, I discovered express trains in Japan. Basically, some stations have express tracks that allow a train to skip the station, overtaking another train that needs to stop at the station, allowing train operators to offer express services with limited stops. Obviously, express tracks are feasible from the engineering point of view, with existing implementations. On the other hand, the schemes that I had imagined were far too complicated and could not be achieved easily. I feel that new lines on our own MRT should consider adding express tracks.

Interviews for Software Engineers
I occasionally interview candidates for software engineering positions. In almost all cases, these are either Java or C++ developers. Like many of my colleagues, I want people who are good problem solvers, and who can design and implement reusable software that is of decent quality. In my technical interviews, I look out for a number of attributes. At the very fundamental level, I expect candidates to be very familiar with basic language constructs, and to have decent knowledge of the standard libraries of that language. Java developers should be very familiar with the most mature parts of the Java Collections Framework. Another important aspect I assess is knowledge of design patterns. The next area I ask questions on is simple data structures and algorithms. Finally, depending on the amount of time available, I may also query the candidate's knowledge of other topics, such as concurrency, databases, operating systems, and other more advanced material. My interview questions are typically pitched at the undergraduate level in a computing curriculum, with the majority of the questions at a level slightly higher than the introductory programming courses.

Here, I make several observations about the interviews I have conducted so far. Firstly, I find the lack of knowledge of basic design patterns in many candidates rather disappointing. I don't mind revealing that I usually start my interview questions by asking the candidate to list a few design patterns they know of and briefly describe them, followed by implementing the singleton design pattern in either Java or C++ (depending on the position being interviewed for). Surprisingly, not many people have answered this question well. For Java developers, I found that a large proportion of candidates do not know generics, much less lambda expressions. A smaller but still significant proportion of candidates don't seem to be able to leverage the Java Collections Framework to write effective code, such as merging multiple collections into a single collection consisting of unique elements in a few lines of code. For algorithms, because the area is so broad, I tend to stick with the more standard questions, otherwise the process is pretty much hit and miss. One thing I observed is that, given the diverse backgrounds of the candidates hailing from various countries, a significant number of candidates do not know the concept of computational complexity and the big-O notation at all.
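To make the two Java questions above concrete, here is a minimal sketch of what acceptable answers might look like. It is only one possible answer, not a model solution from the interviews: the names `InterviewSketch`, `AppConfig`, and `mergeUnique` are hypothetical, the singleton uses the initialization-on-demand holder idiom (one of several common ways to write it), and the merge keeps first-seen order by using a `LinkedHashSet`.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names throughout; a sketch, not a definitive answer.
final class InterviewSketch {
    private InterviewSketch() {}

    // Singleton via the initialization-on-demand holder idiom:
    // the JVM loads Holder lazily and guarantees thread-safe initialization.
    static final class AppConfig {
        private AppConfig() {}

        private static final class Holder {
            static final AppConfig INSTANCE = new AppConfig();
        }

        static AppConfig getInstance() {
            return Holder.INSTANCE;
        }
    }

    // Merge any number of collections into one set of unique elements,
    // preserving the order in which elements are first seen.
    @SafeVarargs
    static <T> Set<T> mergeUnique(Collection<? extends T>... collections) {
        Set<T> merged = new LinkedHashSet<>();
        for (Collection<? extends T> c : collections) {
            merged.addAll(c);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Both calls return the same object.
        System.out.println(AppConfig.getInstance() == AppConfig.getInstance());
        // Duplicates across the inputs are dropped.
        System.out.println(mergeUnique(List.of(1, 2, 3), List.of(3, 4), List.of(2, 5)));
    }
}
```

The set does the deduplication, so the merge itself is a three-line loop; swapping `LinkedHashSet` for `HashSet` or `TreeSet` changes only the iteration order.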

Over time, I have found that the vast majority of the candidates who applied are below my acceptable threshold, even though I have dropped my standards significantly over the years. I have come across a large number of code monkeys who only know how to implement applications by using a web framework (usually Spring) and know very little else, but such applicants do not meet our requirements because we develop software products and not just complete projects. Also, of particular note is that people from a particular region have a tendency to spam their resumes all over the place, including to irrelevant positions, such as offering Java-only expertise for C++ positions. Due to this and other reasons, I heavily down-weight candidates from this region. Overall, good candidates are pretty hard to come by. Interestingly, our experience is that it is easier to hire good C++ developers than good Java developers, despite the popularity of Java over C++. Perhaps the reason is that the barrier to entry for Java developers is quite low, possibly making them the present-day analogue of the BASIC programmers of yesteryear. Yet another thing I noticed is that there are extremely few people who can program in both Java and C++ at even an average level. The vast majority of those who are excellent in Java hardly have any C++ skills, and those who excel in C++ often disclaim any proficiency in Java. I must be one of those very rare people who know at least something in both Java and C++.

When I conduct technical interviews, I tend to give a number of short structured questions. I like my interviews fast paced, where candidates can answer my questions quickly. This is because different candidates have very different backgrounds and come with very different strengths, and a fast paced interview allows me to ask a large number of questions covering a broad range of areas in the limited amount of time I have. At the same time, speed in answering short interview questions can indicate the candidate's familiarity with the material being asked, which is a positive sign. Average candidates who speak slowly and take a long time to answer my questions are at a disadvantage in my interviews, as this limits the number of questions I can ask, and thus they risk not getting their strong points uncovered. Unlike some of my colleagues, I no longer favour giving an hour-long programming task. The main reason is that there is only one question, and typically it assesses a fairly narrow area while using up a significant amount of interview time. I personally found that there is a good chance that a possibly qualified candidate will misunderstand the question, or is otherwise unfamiliar with the context of the question. Furthermore, if the problem is novel to a capable person, he or she may take far more than one hour to research the problem before delivering a good solution. Ultimately, I find that if the candidate performs well in the other parts of the interview, then he or she will have no problems dealing with the hour-sized problem. Assessing candidates on larger problems can always be done during the probationary period.

One philosophy I have when conducting interviews is that candidates are free to use any resources when answering my interview questions. Typically I do not state it upfront, but whenever I see candidates struggling to answer a question, I often allow them to use the Internet to search for answers. In fact, I am open to the idea of opening up at least some of my interview questions to the candidate before the start of the interview, but most of my colleagues do not like this idea, and thus so far I have not done so. The thing is, the workplace is not a two-hour closed-book written examination. We as software engineers google all the time, be it looking up APIs or searching for solutions to problems, and we often discuss or collaborate in our quest to find answers. To me, if the candidate had to use the Internet or ask another person to answer my interview questions, then he or she would have learned something and become a slightly more valuable candidate. This process might even be seen as part of the candidate's problem solving skills. It is still possible to assess the candidate based on how well he or she understands the solution, and whether the candidate can explain it. Of course, any interview questions that are open will be somewhat harder compared to the questions that are only exposed during a face-to-face interview.

In a typical university, there are lecturers who wish they could simply spend 100% of their time doing research, and there are lecturers who are truly passionate about teaching. I think that NUS School of Computing is very fortunate to have a good proportion of faculty members who are really into teaching, or at least don't treat teaching as a duty that they would rather not have. In fact, there is quite a sizable number of Lecturers and Senior Lecturers; these designations mean that their primary responsibility is teaching.1 Regardless of the designation, such passionate teachers that I know of include Aaron Tan, Zhao Jin, and Leong Wai Kay, just to name a few. Of course, my own advisor Kan Min-Yen is definitely included. Too bad I did not have the opportunity to take modules under any of the people I named above, except for my own advisor.

Associate Professor Ben Leong is yet another of those passionate lecturers. To quote from his teaching page: "Research is cool, but teaching is truly meaningful." And he must have made very deep impressions on students to be a runaway winner of one ongoing popularity contest of sorts. Interestingly, Ben writes teaching statements every few years. One of these that strikes me is the 2015 statement, not so much for the teaching statement itself, but rather for the endnote. I quote part of his endnote here (all emphasis mine):

I learnt something new today.

One of the reasons why I am writing these statements is that I believe in discipline and it seemed like an awfully good idea to take some time to reflect on what I have learnt periodically (and every 3 years seemed reasonable).

But I found it very difficult to write this statement. I have amended this teaching statement a number of times. But no matter how hard I tried, it never felt good enough. It always felt like the teaching statement I wrote 3 years ago was better.

But that did not make sense to me. Did I not get smarter over the last 3 years? Did I learn nothing over the last 3 years? Shouldn't I sound wiser now than before?

After agonizing over this for a couple of weeks, I had a Eureka moment during a random online conversation with a friend who is a writer: I realized that I can only activate my inner voice in my writing when I feel compelled to write.

Writing takes effort. When I write, there is always a reason. Most times I write to persuade. In these teaching statements, I write to force clarity on my thoughts and position on teaching. I don't write for fun.

The sad truth is that I really didn't feel like I had anything particularly interesting that needed to be said. I wrote this statement only because it is already December and if I did not write now, it would be more than 3 years since my last statement and that would violate my "discipline".
This reminds me of my own blog. I have maintained this blog at the current location since the first day of 2004, which is a long time.2 At that time, I was able to make posts nearly daily. As time passed, I found myself getting a lot busier, and now I only write monthly posts.

There are many months where I felt like I had nothing interesting to write. As the end of the month approached, I had to rack my brains just to find a topic so that I can pen down a few paragraphs. It often takes me a substantial amount of time to organize my thoughts into something coherent, with varying levels of success in different months. All just for the sake of keeping this "discipline", so as to keep some semblance of maintaining my writing skills. Like the majority of the employed population, I spend most of my waking hours working. While my work is interesting overall, a good part of the work is mundane and very uninteresting, just trudging along and getting things done.3 And it is not unusual for me to find myself overworked and feeling very tired.4 Then as usual, a significant amount of my work is confidential, and then much of the work cannot be described without explaining a large amount of accompanying context. Besides, talking only about work is extremely boring. As a result, without something compelling for me to write, I struggle to make my monthly posts.

I do occasionally write at work, but all of it is technical documentation and hardly anything else. Even though my boss considers my skill in technical writing to be good, sufficient for me to list this skill in my curriculum vitae, I consider my skills to be fairly average given the amount of training I received from my advisor. Additionally, I have the advantage of being English educated while many of my colleagues come from China. I told my boss that I do not write very fast - if he sees something short and understandable, then I would have expended quite a bit of time making my writing concise and paying attention to the flow of the entire document, otherwise he would be reading a core dump from my brain.

These days, I actually blog with no particular audience in mind. I do not care if nobody reads my blog, and I do not read the posts I made in the past either. However, I still try to write something monthly, and for over a decade I think I have successfully maintained this discipline.

1 A quick browse suggests that most of these Lecturers and Senior Lecturers are no longer active in academic research, with zero or very few recent publications.
2 Prior to 2004, I hosted another blog at my long-defunct School of Computing homepage, but I felt it more professional to decouple those emo posts from my academic front.
3 I believe this statement can be said for practically any work in any sector.
4 This is probably also a statement on my general health.

Wild Boars

Credits: Image of wild boar taken from PublicDomainPictures.net.

I made this image last evening, soon after the news of the successful rescue of all 13 Wild Boars. The result is not very satisfying though - when scaled down, the silhouettes of the wild boars don't seem to be easily recognizable.

Urinal Selection Algorithm
Proper etiquette in a male toilet includes the proper selection of a urinal.

Assume that a male toilet has one single row of N urinals, numbered 1 to N from left to right. Consider the problem of urinal selection when N gentlemen enter the toilet to use the urinals one at a time, and none of these gentlemen leaves his urinal before the last gentleman arrives at a urinal. For ease of description, further assume that N = 2^K + 1 for some positive integer K.

The urinal selection algorithm can be described as follows:
1. The first two gentlemen will select the urinals 1 and N, in any order.
2. Construct a height-balanced binary search tree whose nodes are 2, 3, 4, ..., N - 1. Consider the sequence of numbers visited by any breadth first traversal of this binary search tree. Then the remaining gentlemen will select urinals according to this sequence.
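The two steps above can be sketched in code. This is only an illustrative sketch: the class and method names are hypothetical, the first two gentlemen are arbitrarily given urinals 1 and N in that order, and the traversal visits the left child before the right child, which is just one of the breadth first traversals the description allows. The balanced tree is never built explicitly; each queue entry is a range of urinal numbers whose midpoint is the root of the corresponding subtree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical names; a sketch of the selection order described above.
final class UrinalSelection {
    private UrinalSelection() {}

    // Returns the order in which urinals 1..n are taken.
    static List<Integer> selectionOrder(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("need at least two urinals");
        }
        List<Integer> order = new ArrayList<>(n);
        // Step 1: the two ends go first (here: 1, then n).
        order.add(1);
        order.add(n);

        // Step 2: breadth first traversal of a balanced BST over 2..n-1.
        // Each int[] {lo, hi} stands for the subtree holding urinals lo..hi.
        Deque<int[]> pending = new ArrayDeque<>();
        if (n > 2) {
            pending.add(new int[] {2, n - 1});
        }
        while (!pending.isEmpty()) {
            int[] range = pending.poll();
            int lo = range[0];
            int hi = range[1];
            int mid = lo + (hi - lo) / 2; // root of this subtree
            order.add(mid);
            if (lo <= mid - 1) {
                pending.add(new int[] {lo, mid - 1}); // left subtree
            }
            if (mid + 1 <= hi) {
                pending.add(new int[] {mid + 1, hi}); // right subtree
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // K = 3, so N = 9: prints [1, 9, 5, 3, 7, 2, 4, 6, 8]
        System.out.println(selectionOrder(9));
    }
}
```

With N = 2^K + 1, the inner urinals 2 to N - 1 number 2^K - 1, so the tree is perfect and each level of the traversal halves the largest remaining gap between occupied urinals.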

Kilauea Volcano
Kilauea volcano, located on the Big Island of Hawaii, is probably the most active volcano on Earth. The recent eruption in Lower Puna is fascinating, and one aspect stands out: the scale and speed of changes that nature can cause within a very short time frame.

Within just one month, a status quo that had lasted many years was totally altered.

Firstly, there is the ongoing Lower Puna eruption, which started with the opening of two dozen fissures in Leilani Estates and the surrounding area. In the span of a month, the dominant fissure 8 built up a spatter cone with a constant flow into the ocean. Presently, less than two months after the eruption started, the spatter cone is 50 meters high, the flow is as wide as 300 meters in places, and the ocean entry is more than 2.4 km wide. In addition, the entire Kapoho Bay was filled up, with about 1.5 square kilometers of new land created in the ocean. Unfortunately, more than 500 homes from at least three communities were destroyed in the process, as lava covered more than 24 square kilometers. The volume of the flow is very high, and it currently appears to be in a stable state.

Secondly, the Puu Oo crater, which had seen lava almost continuously for 35 years, had its crater floor collapse. This collapse probably triggered the Lower Puna eruption. As of this writing, there is no longer any lava in it, and scientists estimated that lava will not return to Puu Oo soon.

The other very significant change is at the Halemaumau Crater at the summit of Kilauea. After consistently having lava in the Halemaumau Crater for 200 years, the lava simply disappeared within one month with very visible side effects: first the crater floor collapsed by some hundreds of meters and became rubble filled, then the vent enlarged to encompass almost the entire crater, and finally the withdrawal of the summit lava pool resulted in extremely widespread subsidence of the summit, with large scale radial cracking and mass slumping clearly visible. A large number of explosions and shallow earthquakes have been recorded in conjunction with this mass subsidence. The Hawaiian Volcano Observatory at the caldera rim suffered damage and has been evacuated, and I have no idea what its eventual fate will be.

All of these very dramatic changes occurred within a span of just one month. I suppose these were unthinkable even as recently as two months ago. And by the way, seeing lava in person has been added to my bucket list. If the fissure 8 flows continue to remain in a steady state, then Kilauea would be my choice.

Transport Infrastructure
I don't like travelling, including daily commutes to work, because this very uncomfortable activity tires me very easily. Any amount of time spent travelling is a waste of my very limited amount of energy. Probably because of this reason, I often dream of a future where people have the option to go from one location to any other location in zero or very minimal time, regardless of their locations in this world. Not possible at all, but can I dream? I suppose this makes me welcome any transport options that potentially reduce the amount of time it takes to travel, and options that open up new places. I know nature conservationists will hate me, but I really do envision having very extensive networks of roads and railways, which makes it very convenient and fast to go from place to place with many direct connections. I will very much dislike it if the Cross Island line chooses the skirting option rather than tunnelling under the Central Catchment Nature Reserve.

I find myself getting excited about projects on transport infrastructure, especially road and rail. I find myself interested in mega transport projects, both local and overseas. I was pretty awed by the high-speed rail while in Japan followed by Taiwan, and lamented that the Tohoku Shinkansen only reached Aomori at that time, requiring painfully slow transfers as we entered Hokkaido. You can guess how I felt when the Shinkansen finally got extended to Hakodate, with the prospect of reaching all the way to Sapporo in the future. The under-construction Chuo Shinkansen is similarly exciting. I was so excited when the Kuala Lumpur-Singapore high-speed rail was mooted and when I saw it almost coming to fruition, like, high-speed rail is finally coming to Singapore! Needless to say, I was disappointed when the newly formed Malaysian government decided to cancel this high-speed rail project, although this is understandable due to the need for fiscal prudence.

The stations on the Jurong Region Line were officially announced on the day when the results of the 2018 Malaysian General Elections were announced. As a result, what would typically be front page news in the next day's newspapers got pushed very much towards the middle of the newspaper. When I made this remark to one of my colleagues, his response was like, "The new MRT line is not a big news, right?" But it is to me, just like other under-construction lines like the Thomson-East Coast Line, the upcoming Cross Island Line, and potential yet-to-be-announced lines such as the Holland line. However, I recognize that I am not as fanatical as those people who frequently post on the SkyscraperCity forums. These people are the ones who would hunt down soil investigation rigs, and use this and other publicly available information to piece together the alignment of new lines and station locations with great accuracy even before they are officially announced. Some of these people even maintain construction blogs of new MRT lines, which provide lots of interesting tidbits. On the Jurong Region Line, these people posted about the very likely and interesting three-platform layout of Bahar Junction station and how this station might work, although I do question whether this layout is the right choice if it is to avoid a very expensive Jurong East Modification Project style modification in the future.

As an aside, the post on the Thomson-East Coast Line also revealed that when the Land Transport Authority updated its MRT system map to include the Jurong Region Line, it also unofficially added the Gardens Bay East station with a station code of TE22A. I suppose this addition is likely inadvertent. The station code of TE22A is also quite a departure from the previous practice of reserving station codes like NE2, NS12, CC18, and DT4, because no station code was reserved for this unannounced station on the Thomson-East Coast Line. I suppose the Land Transport Authority learned from the public reaction to the Buangkok "white elephant" and Hume station incidents how to provision new stations to be opened at a later time. After the completely fitted out Buangkok station, they smartened up and only built a shell at Bukit Brown. And probably because of Hume on the Downtown Line, they decided not to even reserve a station code for Gardens Bay East on the Thomson-East Coast Line. I suppose the whole point is to make such future stations as inconspicuous to the public as possible. But such information remains publicly available when the Land Transport Authority publishes tender notices for construction projects and subsequently awards them. I do hope that such shell stations do eventually open.

It is common to see this kind of provisioning in public infrastructure projects. For example, the Little India Downtown Line station construction had a short section of "future underground infrastructure" which we now know is the North-South Corridor. Interestingly, Contract N109A for the North-South Corridor also included a "future underground infrastructure" occupying more than half the length of the tunnel section, and I strongly believe that this is for the Cross Island Line, because it is already well-known that this MRT line will have interchange stations at Bright Hill and Ang Mo Kio. Other provisions were also seen at North East Line Chinatown and Circle Line MacPherson stations when they were first built, which we now know are for the Downtown Line. Some of these provisions were built for the very far future, but were eventually not used when plans changed. I suppose one of the biggest wastes of provisioning can be seen at the Circle Line Promenade station, which was obviously designed as a cross-platform interchange, but the plans were totally messed up after the 2004 construction accident which caused Nicoll Highway to collapse. As a result, the Land Transport Authority spent another several hundred million dollars to build new Downtown Line platforms, in a very challenging construction project underpinning the existing Circle Line platforms, yet providing an arrangement that is less convenient for commuters.

Regardless, the Downtown Line Promenade station was very exciting when it opened, mainly because the depth is so mind boggling. In depth, it surpassed Bras Basah station, and has since been surpassed by Bencoolen station, which incidentally is only a few minutes' walk from Bras Basah station via the Singapore Management University compound. On the other hand, the MRT cum viaduct structure in the Tuas extension makes the stations so amazingly tall, with supporting columns so frigging huge. I always look forward to new MRT lines and new stations. I have participated in most of the open houses, and look forward to the next one. Similarly, I await the completion of the North-South Corridor. As much as the expressway itself, I look forward to having a good north-south cycling link, which is sorely missing between Ang Mo Kio and Yishun.

Update. TE22A Gardens Bay East has been removed from the MRT system map. As I predicted, they made a mistake in including an unannounced station.

New FindPython CMake modules
In the master branch of the official CMake GitLab repository lie three new gems: the FindPython, FindPython2, and FindPython3 modules. The previous way of using CMake to find the Python executables and libraries was to use the FindPythonInterp and/or FindPythonLibs modules, and these were very problematic.

Not too long ago, I needed to embed the Python interpreter in one of my C++ projects, in order to use code written by one of my colleagues. I needed to support Ubuntu and Windows, on some or all of x86_32, x86_64, armhf, and aarch64. For those who recognize them, the last two CPU architectures require cross-compilation. Unsurprisingly, the original FindPythonInterp and FindPythonLibs modules don't work for this. Given the deficiencies of these modules, various authors have written a variety of other Python finding modules, such as the widely-used FindPythonLibsNew. But they don't fully work either.

Then I found the FindPython, FindPython2, and FindPython3 modules, which are a rewrite of the existing FindPythonInterp and FindPythonLibs modules. Further, techniques used in quite a few of the other alternative Python finding modules were also incorporated into the new FindPython modules. I decided to try them out, and with minor tweaks I got them working for cross-compilation on Ubuntu, and another workaround resolved an issue I had on Windows. Thus, I had them working for all the platforms I compile my project for.

Feeling charitable, I decided to file an issue on the CMake repository, and submit a merge request for the cross-compilation problem. The response from the CMake maintainers was very rapid, typically within 24 hours on a weekday for every message that I posted. With their guidance, the merge request went through their processes and was merged within one week. Along the way, I also brought up the problem I had on Windows, and the original author of the new FindPython modules found the bug and fixed it a few days later. Overall, it took less than two weeks. Very nice.

I very rarely file issues and pull requests on open source projects, but it does not mean I have never done so before. The response from the CMake maintainers was extremely fast, compared to a few of the other projects that I have worked with. There was this widely-used Java framework that had my pull request sitting for months before finally accepting it, and that was after repeated reminders. Then there was also this software library for a very prominent database, maintained by the same organization, that left a major bug open for many years. I had actually analyzed the bug and identified the cause, but the organization had zero interest in fixing it. Comparatively, the CMake project has the fastest response times I have ever seen.

The new and shiny FindPython, FindPython2, and FindPython3 modules are slated to appear in CMake 3.12. If you need to use CMake to integrate the Python interpreter or libraries, I recommend giving them a try when they are released.
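For readers who want a starting point, a minimal CMakeLists.txt fragment using the FindPython3 module might look like the following. It is a sketch under stated assumptions rather than the configuration from the project described above: the project name `embed_python_demo`, the target `my_app`, and the source file `main.cpp` are hypothetical, and it does not include any of the cross-compilation tweaks or Windows workarounds mentioned earlier.

```cmake
cmake_minimum_required(VERSION 3.12)
project(embed_python_demo LANGUAGES CXX)  # hypothetical project name

# Locate a Python 3 interpreter together with the headers and
# libraries needed to embed the interpreter in a C++ program.
find_package(Python3 REQUIRED COMPONENTS Interpreter Development)

add_executable(my_app main.cpp)  # hypothetical target and source file

# The imported target carries the include directories and link libraries.
target_link_libraries(my_app PRIVATE Python3::Python)

message(STATUS "Python3 interpreter: ${Python3_EXECUTABLE} (version ${Python3_VERSION})")
```

Linking against the imported target `Python3::Python` avoids handling include paths and library paths by hand, which was typically needed with the older FindPythonLibs variables.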

And by the way, I am hardly a Python developer. I do not have much expertise in this language. I am primarily a Java and C++ programmer.

2XU Compression Run 2018 (21 km)
Last year I took a hiatus from half marathons, and originally I didn't intend to resume them. However, two of my colleagues asked me whether I would join the 2XU Compression Run, and I said yes. So here is my race report.

First of all, congratulations to Kohan for completing his first half marathon. His timing is respectable.

I had a great run with an expected official net time of 01:50 or thereabouts. Weather was cool throughout the race. However, the actual distance appears to be about 1 km short of a standard half marathon. If this result is officially recognized, then this will erase 6 minutes off my previous personal best under the category "half marathon, but significantly shorter than 21.1 km", which incidentally was the 2014 edition of the 2XU Compression Run. Regardless, I had this post-run high which lasted for quite a few hours, and it felt good.

For this race, I was in the 4th wave. “Computer issues” resulted in a start delay of over 42 minutes for this wave. Some parts of the route were narrow, and felt like an obstacle course with all those slow runners. I started to feel energy depletion from about the two-thirds mark, so at that point I slowed down a bit. However, when I overtook the 02:40 pacers I knew I had hit my target net time of 02:00. I haven’t achieved this time in any of my training runs this year. So yeah.

Energy boost was in the form of Honey Stinger Energy Gel at 7 km and 14 km. This seems to work without triggering gout, unlike some of the other energy gels. Thanks Boey for recommending honey.

Update. Official net time is 01:49:34, ranked 401 out of 7031. The splits are given as:
9 km - 44:56
11 km - 01:05:01