Coronavirus: My Personal Thoughts and Preparation

[I am not an epidemiologist/expert. These are all from my personal research, interpretation, and general preparation]

Like everyone around, I have been following the Coronavirus outbreak very closely for the last month and a half. In my observation there are two schools of extreme opinion out there (well, there is also mass ignorance and indifference in some cases, but I am talking about the educated, well informed crowd):

covid19-growth-healthcare

1) Why are we so worried about this when flu (or smoking or car accidents–insert any cause of death with a large toll) kills so many people anyway?

2) PANIC. WE NEED TO MOVE TO THE MOUNTAINS…

While panic is the wrong solution, so is indifference–we should try to gather all the information and look at the whole situation from first principles. The best thing we can do is somehow reduce the growth rate of the virus so that a) we do not shock load our already frail healthcare system, b) we buy time to get to a vaccine, and c) we get to a point where the so-called growth factor of the disease plateaus (an excellent video on this here).

Facts First

First let’s discuss the facts that we know from competent sources:

  1. COVID-19 is far more fatal than the normal flu. For historical context, the 1918 Spanish flu pandemic infected 27% of the world population and killed ~50 million (it killed 20M+ in India alone), and the global movement rate at that time was a fraction of that of the present day. Compared against historical pandemics, COVID-19 looks closer to the Spanish flu than to the general flu or even swine flu.

Source: Boston Consulting Group Report

2. The death rate for older people/people with pre-existing conditions is much higher than that of the young crowd. Hence the young outgoing crowd can easily become the main spreading vector and infect the older population.

covid-19-death

3. ANY healthcare system has its limitations and is not designed to support a high number of people being sick at the same time. An exponential growth of a pandemic unchecked can destroy a healthcare infrastructure by shock loading it.

covid19-growth-healthcare

4. A vaccine is at least a year away, and if unchecked the virus grows exponentially (in layman’s terms–it takes just weeks to grow 10x).

5. The virus primarily spreads from contact with infected people OR surfaces that contacted the infected person–the primary source is water droplets and not air.
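
To put point 4 in rough numbers: under steady exponential growth, the time needed to grow by any factor is the doubling time multiplied by log2 of that factor. Here is a minimal Python sketch of that arithmetic. The 6-day doubling time is purely an illustrative assumption (not a figure from the sources above), and the function name is made up for this example.

```python
import math

def days_to_multiply(factor, doubling_time_days):
    """Days for a quantity that doubles every `doubling_time_days` to grow by `factor`."""
    return doubling_time_days * math.log2(factor)

# Assumed doubling time of 6 days (illustrative only, not measured data).
assumed_doubling_time = 6
days_to_10x = days_to_multiply(10, assumed_doubling_time)
print(f"10x takes about {days_to_10x:.0f} days")
```

With that assumed doubling time, a tenfold increase takes roughly 20 days, which is the "weeks to 10x" behavior described above. A longer doubling time stretches this out proportionally, which is the whole point of slowing the spread.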

 

Preparation

With all of these facts in mind, and reasoning from first principles, these are the steps I am personally taking (let me also acknowledge that I am incredibly privileged to be able to do a lot of these):

infographic-coronavirus

Source: Johns Hopkins Infographics (best ways to protect)

  1. The most obvious ones: a) religiously wash hands before and after entering home and office, and before eating food b) use proper hand washing technique c) wipe phone, laptop, and keys with disinfectants d) wipe desks and other regularly used surfaces e) try not to touch eyes and face (pretty bad at this) f) separate outdoor clothes from indoor ones

  2. Work from home: unless I need to be in the office for some specific meetings/interviews, I plan to work from home for the next few weeks

  3. Reduce social contact–I have decided to stop going to parties and large gatherings (history shows that social distancing works).

  4. I have stopped using public transportation and have been using the Lyft ebikes to commute to work (disinfect the bike handles)

  5. Prepare for a shock in the supply chain/reduce my exposure to grocery stores: While a shock in the supply chain is unlikely (even in Wuhan they had supermarkets open), grocery stores are probably the worst place to be. So I have gotten dry food which would probably last me about a month. Nothing fancy–rice, lentils, beans, pasta, canned soup, frozen meat, fish etc.

  6. Taking good nutrition, multivitamin supplement, proper rest, and drinking a lot of water.

  7. Talk to older community and family members and try to find a safe solution for them

coronavirus-preparation

Source: My personal pantry stocks (dry food, frozen food, camping food)

The main benefit of these efforts is that we will not shock load the healthcare system and will buy ourselves time. Hopefully this way we will be able to flatten out the growth curve after reaching an inflection point.

inflection-covid19

Source: Exponential growth and epidemics

Context of Bangladesh:

Bangladesh has just announced 3 active cases of Coronavirus as of March 8, 2020. Being an extremely densely populated country with third world infrastructure makes it really scary, and some of my friends and family back home are justifiably afraid. We have already seen what the shock loading of infrastructure looks like–markets are already out of handwash and masks.

However, I think the same principle applies to Bangladesh. In addition, one saving grace (not fully scientifically tested, but highly likely) is that given the high temperature and humidity of Bangladesh, Coronavirus will most likely survive outside of the host body for a very short time (minutes to hours vs days in colder weather: Johns Hopkins University did an excellent study on this). However, given the high population density and generally poor hygiene among the masses, it is still extremely dangerous–especially since, given the bad air quality, people are very prone to respiratory diseases. I am particularly worried about older friends and family members in Bangladesh.

Posted in Life, Uncategorized

Open Road: Lessons from a Month Long Europe Trip

As I caught the flu right at the end of my month-long trip to Europe and decided to take a few rest days, I got some time not only to process this trip but also to reflect on some of the key experiences. While I still have a lot to process (and a lot to write about many of the crazy experiences), here are some of the key insights:

chamonix ski mont blanc

Things will go wrong, but that is okay:

The day plan you made will not work out because there is a huge line of tourists in front of you, you won’t be able to ski all of the days because lifts will be closed due to a storm, or heck you will get sick and need to stop everything to take care of yourself. All of these are okay and must be expected because like everything in life things will go wrong and you need to plan around them.

switzerland zermatt hiking ski

You will have good and bad days, but ultimately your happiness depends on yourself:

This is also a general philosophical point, but an important point to remember while traveling–happiness depends on the mental framework more than anything. As it is famously quoted in psychology: “happiness is the difference between your expectation and reality”, so learning how to fine tune expectation in accordance with reality is something that is hard to master, but worth trying to learn to make the whole experience more pleasurable.

Despite a packed schedule, stop and soak in the experience:

You will always have more things to do and more things to see–especially when you are trying to visit 17 cities in 30 days. However, it is important to stop for a minute to soak in that incredible art piece by Dali and read the Wikipedia article on it. Similarly, it is important to stop for a second to appreciate the history and the vibe of a place.

dali time art prague

Be radically self-reliant:

This might sound like a Burning Man motto (and it is!), but what I mean by this is making sure you have a plan for the worst and you can execute that without a lot of outside help. That could mean things such as writing down emergency contact numbers on a piece of paper (comes in handy when you lose your phone!), learning how to navigate an unknown city in both digital and analog ways (how do you get back if you lose your phone?), bringing your satellite communicator on the ski slope, and most importantly–learning to take very good care of yourself and your body.

cervinia italy ski matterhorn

BUT aggressively ask for help from strangers:

It is really reassuring to experience how much good we have around us and how far strangers are willing to go to help you. It could be a small thing such as asking a stranger how to validate the metro card, or a crazy moment such as losing your phone in the mountains around the Swiss-Italian border(!). Despite not being able to speak English well (or at all), most strangers will try their best to help you.

Communicate with your travel partners:

Traveling solo and traveling with partners are completely different beasts. But if you have someone around, communicating even the smallest details is important. It could be as trivial as you have an ingrown nail or that you are not feeling very well. Tell them, maybe they can help! Constantly communicate your situation and expectations. That will make everyone involved happier and saner.

Make a checklist of basic lifelines and DON’T lose them:

Given that I had my whole ski gear and camera gear with me, keeping track of all of these things would have been very overwhelming. So instead of trying to safeguard everything (and in the process driving myself insane), I made a basic checklist of three things I must not lose (passport, wallet, phone) and constantly kept track of them. The rest were replaceable.

Remember it is a vacation:

At the end of the day, don’t stress about waking up exactly at 8 am instead of 8:20. Nobody will die if you are 20 minutes late to start your day or if you miss one “must see” attraction. So don’t kill yourself and try to enjoy the whole experience as much as you can.

It is not that I have always been good at following these, but these are some of the key takeaways I will remember on my next trips to make them better.

Posted in Uncategorized

Bangla Pi: Is Affordable Computing the Silver Bullet to Development?

 (A version of this article was published at the Conference for Asian Countries on Digital Government)

“Look, this cat is moving!”, Duti, a 12-year-old girl from Hajipur girls’ school exclaimed. Her eyes were fixated on the 10 inch LCD screen of a bizarre device with a wooden frame labeled “Bangla Pi” that was clearly concealing a prototype. Standing five feet away from her, I was struck by the incredible feat of enabling a girl from one of the most remote villages of Bangladesh to program a computer. This village girl had just animated a cat with the programming language Scratch, developed in the MIT Media Lab.

bangla_pi_kanihati

To understand the story behind this rather extraordinary scene, we have to jump back six months in time to my dorm room at Harvard. One day, while working with a Raspberry Pi, a credit card sized computer that is sold for 35 USD, the idea of using a similar hardware architecture as a medium for affordable computing and access to information crossed my mind. As I dug deeper, I realized how far the semiconductor industry has come. By Moore’s law, we have already seen how computing power has doubled every 18 months over the past few decades. As a result, the cellphone in my pocket has a processor with more computing power than the computers that launched the Apollo missions to the Moon!
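
To get a feel for what "doubling every 18 months" compounds to, here is a minimal Python sketch. It takes the 18-month doubling period from the paragraph above at face value; the function name and the sample time spans are just illustrative choices.

```python
def moores_law_factor(years, doubling_months=18):
    """Growth factor after `years` if capacity doubles every `doubling_months` months."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings

# Sample spans chosen only for illustration.
for years in (3, 15, 30):
    print(years, "years ->", f"{moores_law_factor(years):,.0f}x")
```

Under this idealized assumption, three decades correspond to 20 doublings, or roughly a million-fold increase.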

The basic idea behind this affordable computing tool was to use smartphone processors (which can be bought for as little as 3 USD in the Chinese market). Using these processors along with open source operating systems such as Debian or Ubuntu, we can simulate a desktop computer like experience. I used the cheapest processors and put them together with LCD panels to make a device that can do everything a typical computer can do, but for a much lower price. I named this device Bangla Pi.

I went back to Bangladesh during the winter break of my senior year (2015) on a fellowship from the Harvard South Asia Institute and made 20 of these prototypes. I bought electronics parts from China and spent a long week making those devices. Because I did not have access to a 3D printer (it would not have been cost effective anyway), I just went to local photo frame makers to custom build wooden casings for them. These bizarre devices could function exactly the same way as typical desktop/laptop computers and had USB ports for a mouse and a keyboard. It was also possible (after a lot of hacking) to connect wifi modules to the devices to support connectivity. It cost me ~65 USD to make each of these devices, and the LCD panels were the most expensive parts (~40 USD each). However, talking to some of the Chinese manufacturers, I found that at scale these prices can be much lower.

 bangla_pi

With these devices, I ran a few pilot projects in Dhaka and a few very remote villages in Bangladesh. The results I found were amazing–the students (high school students between 12 and 18 years old) seemed to pick up these skills very, very quickly. I was amazed how some of them, without having used a computer before, made computer games with Scratch, an interactive, easy to use, graphical programming language that enables young students to program by moving small code blocks. The most important lesson that I learned from this experience was that we can enable natural learning with similar devices, and with connectivity we can empower students to learn anything they want to learn about.

Now that we have a very affordable computing platform that promises to deliver connectivity and computing power, should we just distribute these devices en masse? While I truly believe in the potential of Bangla Pi and similar affordable computing platforms, I think the answer is more complex than a simple yes or no. It takes a bit of naiveté and a leap of faith to do something as crazy as connecting the world when many other challenges are presumably of higher importance today, and I would argue that this naiveté was even part of highly acclaimed and widely distributed devices such as One Laptop per Child (OLPC) by MIT Media Lab founder Professor Nicholas Negroponte. While his idea of a sub 100 dollar computer was bold, the most important things OLPC missed were the right context and the ability to emulate an operating system similar to a standard desktop PC. First of all, OLPC lacked a clear goal as a use case, whereas for Bangla Pi the goal was clear–enable learning how to operate computers and incorporate it into a standard curriculum.

Another important aspect that many technopreneurs forget is the importance of training and customer service. We might deliver computers to every school (with enough funding), but if we do not train the teachers and cannot make sure that the devices will still be functional after a few months, then we are probably introducing an overhead instead of helping the educators. For this very reason, we designed Bangla Pi as modular units. There are five different units in Bangla Pi–processing unit, power unit, display units, input unit (mouse/keyboard)–and each of these individual units (except the display, which is fairly durable) costs under 10 USD and is very easy to replace in a plug and play fashion. So we can create a 10% redundancy with spare parts (for example, in a school with a hundred computers) and easily make sure that all of the units are fully working.
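
As a rough illustration of the 10% redundancy idea, the Python sketch below counts how many spare modules a school would stock. It assumes each computer uses one module of each type, uses only the module names listed above, and rounds up to whole modules; the function name is hypothetical.

```python
def spares_needed(units_in_use, redundancy_percent=10):
    """Spare modules to stock, rounded up to a whole module (integer ceiling division)."""
    return -(-units_in_use * redundancy_percent // 100)

# Module names as listed in the text; one of each per computer is an assumption.
modules = ["processing unit", "power unit", "display unit", "input unit"]
computers = 100

spare_stock = {name: spares_needed(computers) for name in modules}
print(spare_stock)
```

For a school with a hundred computers this works out to ten spares of each module type, so a failed unit can be swapped out on the spot.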

I have studied a lot of literature and many books on the topic of computing in developing countries. One that truly grabbed my attention was Geek Heresy by Kentaro Toyama, a fellow Harvard alum and a former Microsoft Research fellow in India (and current professor at the University of Michigan). In his book, he brings up many important points regarding how many of these silver bullet solutions for development with technology have failed. And from his experience working in India at Microsoft Research, he also explains the nature of some of the projects that truly succeeded. While the details are beyond the scope of this article, I would like to talk about the main point of the book, which is that technology primarily amplifies human forces, but if no such force exists today, no amount of technology will help. The same philosophy was apparent in Microsoft founder Bill Gates’ book The Road Ahead, in which he said “The first rule of any technology used in business is that automation applied to an efficient operation will magnify the efficiency. [...] Automation applied to an inefficient operation will magnify the inefficiency”. Therefore, the primary point that we need to address before we just go ahead and make all the schools digital is that we need to make sure our human force (i.e. the teachers) is at a state where it can take part in this amplification process.

Now the question stands: in the context of Bangladesh and similar developing countries, how can we leverage technologies such as Bangla Pi to amplify human capabilities? At the risk of sounding cliché, we need to ask how we can empower people with these technologies and make sure they have access to the information and services that they need to improve their quality of life. I personally believe that the possibilities are endless. Let me start with a few possible key game changers:

1.     Improving education with human augmented technology: We could develop a centralized digital pedagogical system where the typical hour long teaching method would be augmented by 15 minutes of visual content. This could both take a bit of the teaching burden off of the teachers’ shoulders and make learning much more interesting for the students.

2.     Access to better health information and services: We could empower the current network of health workers to be able to connect with the people in the villages better via technology products such as Bangla Pi. Moreover, they would be able to direct the people on where to find the best healthcare by providing information via these devices.

3.     Marketplace for the farmers: Without getting into details, a major shift in the current marketplace could be achieved if we could eliminate the information asymmetry that exists in the agri-marketplace. This information asymmetry only helps the middlemen and harms the farmers and end consumers.

For all these problems to be solved, we need to acknowledge the existing solutions and see how technology based solutions using connected devices such as Bangla Pi could help us solve these problems by amplifying the existing human efforts. While I am a very strong supporter of and believer in technology, I do not think that technology alone can solve all our problems (maybe in 20 years we will have strong AI (artificial intelligence) in place and it will be a different story, but not today). However, technology augmented with human power can amplify the efficiency of the human force manifold and as a result can fundamentally change the way billions of poor people live. Just imagine a village in Bangladesh where a farmer can get agricultural information within a few seconds and can get health information when his wife is pregnant. Imagine a world where his kids go to school and get an education augmented by thought provoking visuals and proven pedagogical guidelines. Imagine a world where that very farmer can auction his crops at the best price in a marketplace where information asymmetry does not exist. Imagine a world where the power of the government is distributed to the masses because everyone is well informed. That future, enabled by technology and connectivity to vastly amplify and truly empower human efforts, is the future I dream of every day.

Posted in Education, Programming

How to get into Harvard, MIT or X*?

*insert the name of a prestigious US university

Short answer

For US admissions you need—1) Academics (SSC/HSC/A/O level) 2) Tests (SAT, SAT II, TOEFL) 3) Extracurricular Activities (ECA) 4) A Story

But if you stop there, you’ll never get into Harvard/MIT! As we get this question on our Facebook group ALL THE TIME, I just wanted to address this.

Annengerg-Hall-Harvard-College-Dining-Cambridge-Buzzing

Why Harvard/MIT?

First, why do you want to get into Harvard/MIT? This is a question that you must ask yourself because Harvard might not be a fit for you and you’d be unhappy. Education quality-wise, many US top colleges will have similar quality of education, but the networks and other opportunities make these prestigious places, well…so prestigious.

 

How do they select?

First, let me tell you how places like Harvard/MIT build their “classes”. They take people from different fields and try to find the promising ones (or already young stars) in their fields. So if you are in the 12th grade and have nothing to show that you are that great in something, maybe you have no chance of getting into those—I will be honest. However, if you are in grade 8, 9, or even 10, you might have time to invest in something, improve yourself, and achieve greatness. Do something that really interests you, be really good at it, and make it your story. So what is your story?

Admission into any US university is hard because they take a holistic approach: they look at all your scores and extracurricular activities and try to find out how you used your time and what your caliber is. For example, if you judge a kid from a village in Africa by the same standard as a kid from a rich family in the US, you are not being fair. The US admissions people know that.

 

My Story

Let me tell you my story. I went to a school in a small city called Kushtia. There we did not even have a debate club, so I was one of the members who started one. This is the kind of initiative that shows your true passion and quality. I was very lucky to make it onto the national math team several times and won a bronze medal at the International Math Olympiad (IMO). These kinds of international honors help a lot. But look: to get there I did Olympiad math for 5 hours a day on average (sometimes 15-18 hours before the IMO) for over 6 years. That is more than 10,000 hours of work. Do anything like that in any field and you will become so good that places like Harvard would love to take you. But are you willing to spend 10,000 hours?

 

Math Olympiad and Cloning

There was a very interesting post on the MIT admissions website saying that if you copy someone who got into MIT and become exactly like them, you might still get rejected, because “cloning is still for sheep”!

“Some applicants struggle to turn themselves into clones of the “ideal” MIT student – you know, the one who gets triple 800s on the SAT. Fortunately, cloning is still for sheep. What we really want to see on your application is you being you – pursuing the things you love, growing, changing, taking risks, learning from your mistakes, all in your own distinctive way.” (http://mitadmissions.org/apply/prepare/highschool )

So just because I got into MIT/Harvard with Math Olympiads, that does not mean that you would too. I did a lot of other things, but more importantly, I had an interesting story to tell relating to all of these. It is not even true that only Math Olympiad kids get into Harvard/MIT/other prestigious schools. People who won international debate contests or made a difference with their social work also got into those colleges. But see one similarity? They were all passionate about what they did and did something really well, along with being good at academics and other stuff. So to make yourself stand out, do well in something you really love. That would make you happy in the long run.

Harvard-College-Cambridge-MA-Ultimate-Education

Finally…

Please don’t ask me “I have X in SAT, have done Y ECA, have Z gpa. Will I get into Harvard?” I can’t answer your question, because everyone has a different story to tell, and that’s why you write your story in the essay. If a kid from a village in Bangladesh started a business that helped a lot of people, that would be more impressive than if the kid of Bill Gates did the same, right?

Finally, nobody is sure to get admission to MIT/Harvard. Historically, Harvard has rejected many IMO gold medalists and even IMO perfect scorers because they were not a good fit or were seriously lacking in some of those 4 fields. So instead of trying to get into Harvard/MIT, try to improve yourself and be world class at something. Then you’ll realize that it doesn’t matter whether you got into Harvard or not, but the path you took to become a master is what matters in life.

 

p.s. If you have more questions:

1. Join our group: Bangladeshis Beyond Border: Undergrad Admission Info Portal

2. READ the files in the file section.

3. For more info about Ivy league admissions read this excellent blogpost by Nazia.

Posted in Admissions, Education, English, Math Olympiad

Project Bangla Pi: Why I Care about Affordable Computing

It is ironic that similar to the way computers understand the world with compiled instructions in binaries (0s and 1s), they have divided the world into a binary group system– the group that knows how to use computers and the one that does not.

Most supporters of computer education these days start their pitch with teaching everyone coding and claiming that everyone will be a computer scientist. While I would love that future, I feel that this pitch comes from a first world perspective, where most kids already own an expensive computer and know about computers and smartphones from an early age. With this we are majorly discounting the rest of the world, where many kids have never seen a computer.

bangla_pi2

Currently the world literacy rate is 84.1% (2013), but the digital literacy rate is much lower than that. With more and more computers being used every day, for tasks as simple as writing emails or applying for jobs over the internet to more complex tasks such as writing programs or using computers in factories, I don’t think I need to convince anyone that we need to teach young kids how to use computers. These basic computer skills are becoming as important as learning how to read and write. For that purpose we need computers accessible to all kids.

There is no doubt that the world needs more quality educational materials. Khan Academy clearly showed us how we can empower individual students if we just give them access to quality educational content. The kids can learn by themselves. This matters especially because in many parts of the world getting a quality education is hard due to the lack of good teachers. While I don’t believe that MOOCs (online video lectures) can be as effective as direct classroom education or solve all of our problems, I believe they could be of great use in augmenting classroom study content. For example, when a kid learns about history we can reinforce the learning experience with some multimedia content. I believe the flipped classroom model could be hugely successful if we can find a channel to deliver that content to the students.

bangla_pi3

Finally I’d get back to the coding part. Kids who have already been using computers for years should really learn to code or understand coding. They might not need to write thousands of lines of code in the future, but in this age of machine learning and massive automation, learning to code is going to be an invaluable skill for understanding the world. For example at Harvard, fewer than a hundred students major in computer science every year, but this year 889 Harvard students took CS50, the legendary introduction to computer science class! This clearly shows why you should learn about computer science if you are already privileged enough to have computers.

The reason for this long and (hopefully) obvious discussion about the necessity of computers is that if we want to solve the problems of the world through better education, using computing devices is necessary. If I use the terms of Economics, a computer is not a luxury good anymore– it is a necessity good for everyone. But with the average price of a good laptop being about a thousand dollars, buying computers might be a bit hard even for some people from first world countries. Now consider countries like Bangladesh, where the average annual income per capita is 1044 USD: buying a computer for their kids is next to impossible for most families.

 

bpi_logo

 

As a naive young college student, I have been thinking about this digital divide issue for a while. Last semester, when I was playing with a Raspberry Pi in my dorm at Harvard, I was wondering why we cannot just use these and some cheap LCD panels to make some small computers for young students. For those who do not know what this is, Raspberry Pi is a computer board with a 900 MHz (overclocked) CPU, 512 MB RAM, and 8 GB disk space. Basically this is equivalent to the computing power of a mid range smartphone or tablet. But do we really need that much computing power for teaching kids? It turns out–not really.

So I decided to do this over the winter break and got 15 Raspberry Pi boards and imported some cheap LCD panels from China (getting them cleared from customs was hard, but that is a story left for another day!). Then I scouted all over Dhaka to find cheap peripherals such as keyboards, mice, and microSD cards. With all these parts and many hours of labor (plus wooden frames made by the local photo frame makers), I finished assembling 15 devices and custom built a power supply system for all of them.

bangla_pi1

While assembling these was nontrivial, making the OS ready was another challenge. I have been working with different flavors of Debian for a while, but running the OS on the ARM chip is a bit hard given the resource constraints. So I had to do a bit of modification of the Debian based Raspbian OS and had to make sure we have everything in the Bangla Pi OS that I distributed with the 15 prototypes. The great thing is that you can do almost anything you can do with a typical computer. While these devices do not have a lot of CPU power, so far they have proved to be adequate for most educational purposes.

So this winter I am running three pilot projects to see what we can do with these devices. We are running these workshops in three places: one in Dhaka (where all the kids have been using computers for years), another in Pakundia, an upozilla (sub-district) in Kishoregonj, and finally one in a small village of Sylhet. I am just trying to see how all these kids with very different computer literacy levels interact with computers and how we can use these devices to improve education for them. I am teaching Python programming to the students in Dhaka, and for the other students I will limit it to Scratch, which is a visual programming tool developed by the MIT Media Lab.

bangla_pi4

It would be an understatement to say that organizing these has been difficult. Thanks to my mom and the volunteers, who have been working tirelessly to help organize these workshops and create the curriculum. Also thanks to the Harvard South Asia Institute for their winter grant that enabled me to do this project.

Currently the computers cost about $85 to make, but if we could mass produce them (at least a couple of thousand), it is very possible to drive down the cost with the new technology we have in R&D. This could be a great step toward making computers accessible for all. For the poor this could be a great first computer and for the privileged ones this could be a programming sandbox.

I believe that some people are better than others in some areas because of their inherent ability to excel in that. I also believe that these talented people are fairly randomly distributed regardless of geolocation or income level. So if, with affordable computing for everyone, we could get rid of the digital divide and make education a level playing field for kids from the first world to those from the third world, we would have the best talents from all over the world to solve our important problems. Then we will be one step closer to having the world we all want to live in.

Posted in Education, English, Programming, Python

CS 165 or: How I learned to stop worrying about speed and Love the column stores

This Spring 2014 semester, I took CS 165, a course on big data database systems by Professor Stratos Idreos. The course was being offered for the first time in quite a few years, so there were quite a few of us interested. For me, it was a course I had been waiting a few years to take. As I am very interested in data science and related topics, how databases work under the hood is one of the basic things I need to know before I can do any large scale analytics efficiently. The slide from the first lecture sums up the necessity of learning about DBs perfectly:

slide

When I took the course my expectation was that this would probably be a course where we’d learn about SQL and hadoop/hive for big data. However, when I went to Professor Idreos’s office hours for the first time he said, “you don’t come to Harvard to learn about [how to code in] SQL!” So, as a column storage expert, he decided to teach us about database systems in light of the column store paradigm.

row-store-v-column-store

Column stores are different from regular SQL-like row stores: they store data in columns instead of rows. So when we query data, the row store has to go through the whole row, whereas in a column store we can just work with the specific column(s) we are interested in. This enables massive performance benefits in many cases, including scaling of the database. I will talk about that in detail in a bit.
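To make the difference concrete, here is a minimal sketch in Python (the course project itself was in C). The table, the column names, and the query are all made up for illustration; a real column store keeps each column as a packed array rather than a Python list.

```python
# Hypothetical three-column table, stored in the two layouts.
# Row store: one record per row, all columns kept together.
row_store = [
    (1, "alice", 30),
    (2, "bob", 25),
    (3, "carol", 41),
]

# Column store: one array per column; position i across the arrays is row i.
column_store = {
    "id": [1, 2, 3],
    "name": ["alice", "bob", "carol"],
    "age": [30, 25, 41],
}

# Query: sum of the ages.
# The row store walks every full record even though it needs one field.
total_from_rows = sum(record[2] for record in row_store)

# The column store touches only the "age" array.
total_from_columns = sum(column_store["age"])

print(total_from_rows)     # 96
print(total_from_columns)  # 96
```

Both layouts give the same answer; the difference is only in how much data has to be touched to get it.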

I was excited and terrified at the same time, because having not taken any systems course (other than CS50, which does not really count) I knew that it would be really hard to make a whole working database system from scratch in C. However, I decided to stick with it. It was not easy, but I am glad I did! I feel this is one of the classes at Harvard where I learned the most. I learned how data systems work under the hood. I learned how to maintain a codebase with about 5000 lines of code. I truly learned how to use pointers and memory management. Finally, I learned about low level CPU and GPU architectures and how to leverage the CPU caches to write cache conscious code. Besides these, I learned a lot of lessons that would pass as the big-picture takeaways from the course:

Lesson 1: The real world is messy

As cliche as it might sound, the real world is too messy. It is easy to say “oh, design a database in a distributed fashion, when you have a server and client and just make the server and client send each other streams.” However, when we implement that it is a pain, to say the least. Moreover, when we get the command/data we need to parse that. Parsing strings in C is honestly no fun…or maybe I am too spoilt by Python.

 cs165_web

Lesson 2: Sockets are neat, but painful…aargh

We have a lot of ports in our computers which we can use to communicate between computers in a network. As we had to make things distributed (i.e. many clients in a network can talk to the server), we had to make sure that the communication is actually happening. It is slightly nontrivial because at any given time one computer should send a message and the other should listen. If both of them send or both of them listen then, like in real life, nothing productive will happen! The second thing I realized was that when you program your node to send, it does not necessarily send the message right away. For example, if you send 8 bytes to the server, the client might not send them immediately, but rather wait for more data and buffer it to optimize the communication. This is a really neat and smart thing, but that means you have no control! So that kind of sucks. To get around that I had to code up a whole streaming protocol that does it correctly, and that was the bane of my Spring break!
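One common way to deal with this (not necessarily the protocol from the project, which was written in C) is to prefix every message with its length and keep reading until that many bytes have arrived. The sketch below is in Python; the 4-byte header, the function names send_msg, recv_exactly and recv_msg, and the two command strings are all assumptions for illustration.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (an assumption)


def send_msg(sock, payload):
    """Send the length of the payload first, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Read the 4-byte length, then read exactly that many payload bytes."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


# A connected pair of sockets stands in for the client and the server.
client, server = socket.socketpair()
send_msg(client, b"select(col1,10,20)")  # hypothetical command text
send_msg(client, b"fetch(col2)")         # hypothetical command text
first = recv_msg(server)
second = recv_msg(server)
client.close()
server.close()
print(first)
print(second)
```

Even if the two messages arrive glued together in one buffer, the length prefix tells the receiver where one ends and the next begins.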

Lesson 3: B+tree…oh the black magic

If you have done any amount of CS (given you have read so far I am assuming that you have), you must have heard of binary trees. If you had the opportunity to implement one for the first time, you probably were not the happiest person in the world. Now take that binary tree and make it more general (i.e. each node can have arbitrarily many keys). As you can probably understand, it is pretty complicated to implement a B+tree. For me it was a solid 500+ lines of uncommented C code!

On the plus side, there are great advantages of B+tree. First of all, due to the structure your leaves’ keys are always sorted and you always get a balanced tree! When I saw it the first time I thought that it was black magic! However, later I realized (doing some complexity analysis) that the lookup cost for an in-memory B+tree is asymptotically the same as for a balanced binary tree. The I/O cost, however, is significantly lower. So if you have a disk based implementation for a really huge database, then it would be strictly better to use a B+tree with a high order (lots of keys per node).

Lesson 4: It takes a lot of hack to make something work

It is easy to throw buzzwords like distributed system, cache conscious, threaded, parallel, and so on, but it is really hard to make it work. It takes a lot of effort to make a system work and the learning curve is steep.

Especially when I was dealing with threading (to make commands from all the clients run in parallel), I realized how messed up things can get with each thread doing its own thing in parallel. My final version is not really thread safe, but at least I understand the concepts of race conditions, thread locks, etc.

One hack I was particularly proud of was lazy fetching for joins. When we make a selection and fetch, we generally have to go through different parts of the column 1+1=2 times. However, lazy fetching does not do anything in the fetch stage and waits for that to happen in the join stage. So I just attached the pointer to the original column (cast to void *!) for that and it worked really well. So in general my join saves quite a bit of work thanks to this lazy fetching.

Lesson 5: Making things cache conscious is hard

When we have any kind of data, it moves from memory (RAM) to the L1 cache of the processor, which sits right next to the registers. Ideally you want all your data to be in the L1 cache, because it takes many times longer to get data from disk/RAM than from the L1 cache. However, the L1 cache is small (something like 32KB per core, depending on your CPU). So the best thing you can do is to make sure that when you do any computation, the data does not have to be moved back and forth (because the CPU evicts data from the L1 cache when it is not actively being used). So we need to make sure that when we do computation with a chunk of data, we do everything possible with it. This is basically the idea of cache consciousness and it is pretty hard to do this kind of optimization. So I ended up writing code like the following for the loop join!

cache_conscious
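The original snippet was an image and is not reproduced here. As a rough sketch of the idea only, the Python below shows the shape of a blocked nested loop join: both columns are cut into small blocks and every pair of blocks is fully compared before moving on. Python itself will not show the cache benefit (the real thing was in C), and the block size and the name blocked_loop_join are hypothetical.

```python
BLOCK = 4  # hypothetical block size; in C you would size it to fit in L1 cache


def blocked_loop_join(left, right, block=BLOCK):
    """Return (i, j) position pairs where left[i] == right[j].

    Both columns are walked one block at a time, so each pair of blocks is
    fully compared before moving on to the next pair.
    """
    matches = []
    for li in range(0, len(left), block):
        left_block = left[li:li + block]
        for rj in range(0, len(right), block):
            right_block = right[rj:rj + block]
            # Do all the work for these two blocks while they are "hot".
            for i, lval in enumerate(left_block):
                for j, rval in enumerate(right_block):
                    if lval == rval:
                        matches.append((li + i, rj + j))
    return matches


left = [5, 1, 9, 3, 7, 1]
right = [1, 7, 8, 5, 2]
pairs = blocked_loop_join(left, right)
print(sorted(pairs))  # [(0, 3), (1, 0), (4, 1), (5, 0)]
```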

Lesson 6: Join is fascinating!

It is said that the most important thing invented in computer science is hashing. Although it is arguable, I know after this course that the claim is probably true. When you do a join, the most naive approach is to go through each element of the second column for each element of the first. This is clearly O(n^2). But we can do better: sort both and do a sort-merge join, which has cost O(n log n). And we can clearly do much better with hashing. We just create a hash table for the smaller column on the fly and then probe it with the elements of the bigger column, and this brings it down to O(n), the holy grail! This works really well in most cases and the final result of hash join is nothing less than impressive! Maybe I will write another blog post explaining the joins.
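Until that post exists, here is a small Python sketch of the build-and-probe idea (the project code was in C; the function name hash_join and the sample columns are made up for illustration). A Python dictionary plays the role of the hash table.

```python
def hash_join(small, big):
    """Return (i, j) pairs where small[i] == big[j], using a hash table."""
    # Build phase: map each value of the smaller column to its positions.
    table = {}
    for i, value in enumerate(small):
        table.setdefault(value, []).append(i)

    # Probe phase: one dictionary lookup per element of the bigger column.
    matches = []
    for j, value in enumerate(big):
        for i in table.get(value, []):
            matches.append((i, j))
    return matches


small = [3, 8, 5]
big = [5, 3, 3, 9, 8, 1]
result = hash_join(small, big)
print(result)  # [(2, 0), (0, 1), (0, 2), (1, 4)]
```

The O(n) figure is the expected cost: it assumes the hash table lookups take roughly constant time and that there are not too many duplicate values producing lots of output pairs.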

Random knowledge:

1. In C you have to truly understand what a pointer and memory (i.e. the “stuff” the pointer refers to) are. Without much systems knowledge I screwed up so many times! What I did once was, when I was passing the names of columns I read from files, I did not copy them. I was basically just passing pointers to the string. So every time the column name got overwritten, and it took me a good amount of time to figure that out.

2. void * pointer is probably the best thing about C (or maybe not…). One of the reasons I loved python was that you can make your functions polymorphic and deal with anything. Now with the void pointer, although you might have to consider cases for different datatypes, you can pass different things in a single struct! Which makes life really, really easy. For example, I had two major data structs: bigarr for array based columns and node for B+tree. So in my column struct I could pass either of them by casting the pointer to a void pointer! Magic!

 

3. i++ vs ++i: This is most likely a very silly thing, but as we learned about the importance of writing tight loops, writing something like the following is necessary.

Although I don’t really know how much that helps in terms of reducing function stack overhead, it is cool and compact at least! So the difference between i++ and ++i is that the former returns i and then makes i = i + 1, while the latter makes i = i + 1 and returns the new value. Fancy, eh?

Performance study and final remarks:

Cumulatively I probably spent 200-300 hours on this project over the last three months. So it would have been really disappointing to see a sub-par result. However, the column store did not disappoint me! Because we deal with at most 4 columns at a time, the joins in the column store scale unbelievably well. The following graph shows the performance when we had N columns with 100,000 rows each, and we see that as N increases PostgreSQL, the most advanced row based store, fails so hard! Look at this sweet graph!!!

perf

CS165 was more of an experience than a class. As I haven’t taken CS 161, I can’t comment on the comparative difficulty of these two, but hey at least I can brag about writing about 5000 lines of code and making my own database system that kicks the ass of row storage!

[I plan to keep developing the DB in the future. The refactored repo can be found here: https://github.com/tmoon/TarikDB/]

Posted in Education, English, Programming

Turkish Diary: Part 1

I decided to visit Turkey almost on a whim. I was supposed to return to eBay during the winter, but the US authority did not process the work authorization in time (the bliss of being an international!). So on New Year’s Eve, I decided that I should make the best use of the free time I had and travel somewhere. Although that meant being almost broke, I also realized that in the near future finding time will be more of an issue than funding any reasonable travel. So more or less randomly, I chose Turkey. Actually, now that I think about it, it might seem like a random decision, but it was not so surprising! In terms of practical reasons, Turkey had pretty cheap plane tickets and they have the fastest visa processing system of any country in Europe (and yes! I do need a visa, unlike US citizens). So speaking of the bang for the buck, Turkey was a no-brainer!

View of Istanbul from the Galata Bridge

Then the second reason, which was probably playing a role in my subconscious, was that Turkey, being one of the leaders of the Muslim world, has a lot of cultural influence in the subcontinent. Only after visiting Turkey did I realize that so many of the Muslim cultural elements in Bangladesh, from Sufism to names, came from Turkey or were influenced by the Turks. This makes sense because when the Turks marched towards India they brought their culture with them (and yes! Delhi was under Turkic rule at some point!). So it was good to get back to the roots of those cultural elements. Also my father visited Turkey almost 20 years ago and in my childhood I heard numerous stories about those tours. So growing up (or almost growing up!) I had to see what the fuss is all about! It was a rite of passage, if you will, or just another family thing!

Because I would go even to Alaska if I needed to for cheap tickets, I got flights to Turkey from the Newark airport! Newark is somewhere in New Jersey. But to get there I decided to use public transportation (because paying 100 dollars for a taxi would defeat the purpose of the cheap tickets). Once more in my life, I realized that digital maps and navigation systems are the best thing invented by human beings after fire! Basically, using Google Maps, I hopped from one bus to another and was in Newark in an hour!

Gloomy Munich

I was in Munich in 8 hours and the weather looked surprisingly gloomy. However, when I landed at Istanbul Ataturk airport the 13 C (55 F) air welcomed me! Compared to the super sub-zero weather and lack of sunlight I had been experiencing in New York City in the last month, this seemed to be a tropical paradise! However, later I learned from the hotel people that the temperature is supposed to be around 0 C. But hey! I am not complaining!

In Boston, we often feel proud of having buildings a few hundred years old and going to a school which is about 400 years old, but in Istanbul things generally start with a count of 500 to a thousand years! On my way to Taksim from the airport, I saw the walls from the Byzantine Empire, which were used to protect the city more than a thousand years ago. And they are still standing tall. Talk about the engineering!

Public Transportation in Istanbul

One great thing about Istanbul is that the city has a great public transportation system. In fact, they have one of the oldest public transportation systems in the world and it is still operational. So I took a subway from Taksim to my hotel. When I was going up, I realized how deep the subway was! Later an internet search revealed that on average the subways are about 20 meters deep, because there are so many old artifacts under the ground that they risk destroying those if they don’t dig that deep for their subways.

(To be continued…)

Posted in Life, Photography, Travel

Play with Python (Part 3): Magic Starts with Variables

(After the previous part: Part 2: Preparing for the Magic)

Summary: Learn about variables, types in python, type changing, input methods, and how to impress friends!

Variables:

Now you can open code in a fancy text editor like a real programmer and show your friends the nicely color coded text. But how do we write programs with more than one line? For that we need to learn the concept of variables.

A variable is nothing but a fancy, geeky name for a box with a label. You can put values (and/or objects) in it and label it by a variable name. It is a good way to keep track of all the numbers you are going to use in a program. These things are called variables because you can vary (change) them anytime. Before I bore you, let’s type up some real examples in IDLE:
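The snippet that belonged here did not survive, so here is a small stand-in. The values 5 and 7 are arbitrary picks, not necessarily the ones from the original. It is written with print(...) so that it also runs on newer versions of python.

```python
# Two boxes, labelled a and b, each holding a number.
a = 5
b = 7

print(a)      # 5
print(b)      # 7
print(a + b)  # 12
```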

Now each of these a, b is called a variable. With that assignment, what the computer has done is that it has created two boxes (with labels a, b) in its memory. Then it put the numbers in the boxes. (One great thing about python: you don’t need to tell the computer about the size of your box. Python is so smart that it can figure it out itself!)

3-var-box

By the way, in python you can write variable names with a mix of letters, numbers, and underscores, but no spaces (i.e. num_3 is a valid variable name, but num 3 is not), as long as the name does not start with a number; so they don’t necessarily have to be just one letter. In fact, in the future we will try to write meaningful variable names so that the code is easy to read and understand.

Now let’s see what happens when you assign variables to another variable.
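Again a stand-in for the missing snippet, with made-up values:

```python
a = 5
b = a      # b gets a copy of the value that is in a right now
a = 10     # changing a afterwards does not touch b

print(a)   # 10
print(b)   # 5
```

In Python 2.7 (the version used in this tutorial) you can also type print a, b to show both values on one line.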

So you can see how python copied the value of the other variable to b. Then after you changed a, b still had the old value of a. (Oh, and you can print several variables by separating them with commas. It even adds a space. How nice of python!)

Changing the value of a variable using its own value is a very powerful idea. We will use it so much that python has a shortcut for it! Try (you also might want to use print to see the values, if you are using a text editor!):
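A stand-in for the missing snippet (the starting value 5 is arbitrary):

```python
a = 5
a = a + 1   # take the old value of a, add 1, put it back
print(a)    # 6

a += 1      # the shortcut: same as a = a + 1
print(a)    # 7

a *= 2      # the same shortcut exists for -, *, / and the others
print(a)    # 14
```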

More on Commenting:

You already know that we can write one line of comment with a hashtag sign and python ignores it when it runs the code. But what if we want to write multiline comments? There is a way too! For that you need to put all the contents within a pair of triple quotation marks.

For example:
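The original example is missing here; the following is a small stand-in with made-up text.

```python
# This is a one-line comment.

"""
This is a multiline comment.
Python does nothing with the text between the two sets of
three quotation marks when it runs the code.
"""

a = 3      # a comment can also sit at the end of a line
print(a)   # 3
```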

Integers and Floats (Data Types):

In IDLE command prompt just type  3 / 2  and see what you get! You get 1, but it should be 1.5, right? What is wrong?
It turns out that because both 3, 2 are integers, when you ask Python to divide it does integer division. So the result is just the quotient (and of course the remainder is 1). If you know some arithmetic then you probably learned at some point that

3-division

Now type:
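The snippet is missing here, so this is a guess at its spirit rather than the original. In Python 2.7 (the version used in this tutorial) 3 / 2 gives 1. The // sign used below does the same integer division, and it behaves the same way in Python 3, where a plain / would give 1.5.

```python
quotient = 3 // 2    # integer division: how many whole 2s fit in 3
remainder = 3 % 2    # what is left over

print(quotient)      # 1
print(remainder)     # 1

# Putting them back together gives the number we started with.
print(2 * quotient + remainder)  # 3
```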

I hope now it is clear how the whole thing works for integer math.

So in python there are two types of numbers

  1. Integers: Known as type ‘int’. Type on IDLE: type(2)
  2. Floats: Floating point numbers. These are decimal numbers with ‘points’. For example: 2.0 is a float. Type: type(2.0)

(There is another called ‘long’ type, but it is not really important unless you are working with really large numbers.)

While there are many good reasons why python (and most other languages) does integer division, you can see that the result can make our calculations wrong (for example, if it were a bank, it just lost a dollar when dividing, and those missing amounts surely add up!). So the way to do it correctly is to make sure that when dividing at least one of the numbers is a ‘float’. We can do it in two ways:

  1. Type them as float: Set both a = 2.0 , b = 3.0 or at least one of them with that .0.  (Try all the combinations yourself!)
  2. Or change them to float using float() function: try using  a = float(a) .

The second way is the standard way to do it for long programs because it is hard to make sure that everything we entered is a float. This method also brings us to data-type changing, also known as typecasting. So you can convert between int and float using the int() and float() functions. Try:
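A stand-in for the missing snippet, with made-up numbers:

```python
a = 3
b = 2

a = float(a)       # 3 becomes 3.0
print(a / b)       # 1.5

print(int(2.7))    # 2  (int() drops the part after the point, no rounding)
print(float(2))    # 2.0
```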

 Screen input and output:

We have already learned how to output something to screen with  print  function, but you may ask how to enter data to a program. This is done by  raw_input()  function. This is the way the function works:

raw_input('optional: The text you want to show')

Now try:

So the raw input function shows the query text and gives a text object (in fact it is called a string and we will learn about it in the next part).

Now try:

 

Oops! Something is wrong. Python should show '''TypeError: cannot concatenate 'str' and 'int' objects'''. So you made a mistake. This is called a bug. When you are programming you will see a lot of these; try reading the messages and checking the line number to see where the problems are, and fix those bugs! So here the problem seems to be that python cannot add str and int data objects. The age value you got is a text string (‘str’ object) and python does not know that it is meant to be an integer. So you need to change it to an integer. Now try:
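The original snippet is missing, so here is a stand-in. To keep it runnable without anyone typing at a prompt, the text '20' is written in directly; it plays the role of what raw_input() would hand back.

```python
# raw_input() always hands back text, so the text is written in directly
# here as a stand-in for what you would type at the prompt.
age = '20'

age = int(age)    # turn the text '20' into the number 20
print(age + 1)    # 21
```

In your own program the first line would be something like age = raw_input('How old are you? '), where the question text is just an example.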

Fantastic! Now that you know how to enter something to a program, let’s do something that could be useful in real life. This is getting exciting!

Real world application: Temperature Converter:

Assume you have a British friend (with ‘nice ack’cent, of cou’se’!) and you are American. They are big on weather and you want to impress him by talking to him about temperature in degrees Celsius instead of Fahrenheit. But it takes quite a bit of effort to calculate that every time. But now you know how to program. So you can just ask your computer to do it every time!

But before that you need to solve the problem and give it an algorithm (oh, fancy!).

The equation for turning Fahrenheit to Celsius is °C = (°F  -  32)  x  5/9

So basically the algorithm should be

  1. Take Fahrenheit as input.
  2. Subtract 32 and multiply by 5/9
  3. Finally print the result

Simple, right? Let’s try it:

Oops! You see a type error. Remember how we fixed it the last time (hint: typecasting)? Nice!

But still the problem is you will always get 0. What is wrong? The issue is that when you write 5/9, python does integer division and gives 0. So the whole thing becomes 0! As explained earlier, you can fix it in two ways, but the easiest is changing it to

cel = (fahr - 32) * 5.0 / 9.0
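Putting the pieces together, the whole converter could look like the sketch below. The text '212' is a stand-in for what you would type at the prompt, so the example runs without waiting for input.

```python
# Stand-in for: fahr = raw_input('Temperature in Fahrenheit: ')
fahr = '212'

fahr = float(fahr)                 # typecast the text to a number
cel = (fahr - 32) * 5.0 / 9.0      # 5.0 / 9.0 avoids integer division

print(cel)                         # 100.0 (water boils at 212 F, which is 100 C)
```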

Awesome! Now you know how to impress your friends with your coding skills!

Exercise: You need to know the volume of a sphere. You know from math that the answer is 4/3 x π x r³, where r is the radius. Write a program like the previous one which asks for the radius and gives you the volume.

(Hint: Remember to typecast that 4/3 part!)

Happy coding!

Previous Part: Part 2: Preparing for the Magic

Next Part: Coming…

All parts: Play with Python

Posted in Education, English, Python

Play with Python (Part 2): Preparing for the Magic

(After the previous part: Part 1: The Magic Behind Computers)

Summary: Install python, learn how to start coding, and plan to successfully finish the tutorial!

How to install Python:

I have lectured enough about the reasons for learning programming and the theories behind computers, but I know that you are not here for those! You want to learn how to make that game, right? But before you do so, you need to install python on your computer. First, you need the python installer (skip this if you are using ubuntu or linux). Get the python installer from this link: http://www.python.org/getit/

2-website-installer

You can see a lot of installers here. Don’t get confused! For this tutorial we will use 32-bit python 2.7 (at the time of writing the latest version is 2.7.6). So download the correct installer file to your computer (see the image above).

Windows:
As usual double click on the installer and let it get installed (click ‘next’ and ‘ok’ many times!).

OS X:
Double click on the package manager and install the package.

Ubuntu/Linux:
If you are using Ubuntu you’re all set, because Ubuntu comes with python pre-installed!

Basics and Sanity Check:

So when you write code, Python has an engine (interpreter) that turns your code into instructions that computers can read (those are basically collections of 0’s and 1’s and only computers can understand them). To give commands to the computer, python comes with a program (interface) called IDLE. Now to check if the installation worked properly, open IDLE from your programs.

2-select-idle

(Windows: Start Menu>All Programs > Python2.7>IDLE (Python GUI), in general you should get it in the list of programs in any operating system).

Now you should see the following window (minus the code text):

2-idle

Now the “>>>” is the prompt of the terminal, where you can enter any valid python command and it will show you the output. Now type:

print "Hello world! I am learning to code!"

And you should see the output text. Congratulations! You have written your first piece of code!

So what you just did is output some text on the screen, and  print is the command for doing so in python. We will explain all of this in a lot more detail in the next chapters, but for now enjoy your new super power of being able to code!

Now try to do this math with python (We have also included the output. Of course you don’t need to type that!):
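The original session did not survive, so the lines below are a stand-in. Each one is wrapped in print(...) so that it also works when run from a file; at the >>> prompt you can type just the math part.

```python
print(3 + 2)    # 5
print(3 - 2)    # 1
print(3 * 2)    # 6
print(6 / 2)    # 3 in Python 2.7 (Python 3 shows 3.0)
print(3 % 2)    # 1
print(2 ** 3)   # 8
```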

The first thing you will notice is that I have added white space in between everything. You don’t really need to do this, but it makes your code look a lot cleaner. So before and after arithmetic signs we add these spaces. Finally, let’s figure out what these signs are doing. The first ones: + = addition, - = subtraction, * = multiplication, / = division. Let’s stop for a second. “But what are the next two?” you may ask, as you probably have not seen these before. Don’t worry, they are easy to understand!

The first one, “%”, is the modulo sign. In python a % b means you are asking for the remainder of a upon division by b. For example: 3 % 2 gave us 1 because that’s the remainder you get when you divide 3 by 2. This is a great way to check if a number is odd or even! If the remainder upon division by 2 is 1 the number is odd, and if it is 0 the number is even!

The next one, **, means to the power. So a ** b means a^b in math. And that means you are multiplying b number of a’s together. More explicitly: 2 ** 3 means 2^3 = 2 x 2 x 2 (i.e. multiplying three 2s).

Play with both the modulo and power operations on the terminal until these concepts make more sense!

Starting to Code:

Now you’re probably thinking, if you enter one line at a time and do simple addition and subtraction then probably you cannot do much. You are absolutely right! That’s why programmers write their programs in a simple text file and then make the python interpreter run it. As we are becoming pros, we will do the same!

Select File > New Window (or press control + N). You should see a blank screen. This is just a fancy text editor (like notepad), but as you type you will see that the text contents are getting nice color coding (the color coding has a fancy name– syntax highlighting)! Now type:
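The original contents of first.py are missing here, so this is a small stand-in. The parentheses after print are optional in Python 2.7 when printing a single thing, and they let the same file run on newer versions too.

```python
# first.py: my first python program
# Lines starting with a hashtag are comments; python ignores them.

print("Hello world! I am learning to code!")  # show some text
print(2 + 3)                                  # show the result of some math: 5
```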

Finally select: Run > Run Module or press the F5 key on your keyboard, and it will probably ask you to save the file. Now give the file any name and end the name with .py (we named it first.py). It is the python file extension (like .pdf, .exe etc) and if you do this the computer will understand that it is python code! Now you can see the output printed in the terminal. Neat!

By the way, the gray texts after the hashtags (#) are called comments. When python runs the code, it goes from top to bottom, but ignores the comments. The goal of these comments is to make the code more clear to the human readers, but you can choose not to type them.

How to complete this tutorial successfully:

Remember the last time you sat down to learn something? Yes, I do too! And I did not finish it! So it is important that you follow these instructions to learn python successfully:

  1. Set aside one hour (or maybe half an hour) daily. This will make sure you are learning every day.
  2. Go through one or two lesson(s) at a time. You might be tempted to go through everything in a single sitting, but that way you will forget everything really fast. So don’t rush and take your time.
  3. Type everything up yourself. I know it is really tempting to copy-paste the code, but don’t do it! Typing up will ingrain the lessons and commands to your brain as so called muscle memory. So after a while you will be able to code without looking at any tutorial or help!
  4. Remember the cool game we will be making at the end? Good!
  5. Finally don’t be disappointed if you don’t understand something. If you find anything unclear, leave a comment below and I will try my best to help.

I know it was a lot of material for a single lesson. But hey! Now you know how to write a legit piece of python code! If you did not understand something, no worries, because we will cover everything in detail. I just wanted to make sure you know where to write code and how to run it!

Happy coding!

Previous Part: Part 1: The Magic Behind Computers

Next Part: Part 3: The Magic Starts with Variables

All parts: Play with Python

Posted in Education, English, Python

Play with Python (Part 1): The Magic Behind Computers

(After the previous part: Part 0: Introduction)

Summary: Introduction to programming and algorithms, reasons for choosing python.

What is programming?

In simple words, programming is just talking to computers. It is similar to learning a new language in the sense that you learn how to order a computer to do something for you. However, the problem is that a computer itself is really dumb (no matter how smart the things it can do after getting programmed!). So you have to give it painfully clear and explicit instructions to make it do anything.

Before I give a concrete example of programming, imagine you have an assistant who does everything for you. However, the only problem is that he is really dumb (still smarter than a computer). Yesterday you asked him to make a simple peanut-butter sandwich for your breakfast and he made a mess! So today you are giving him instructions:

  1. Get whole-grain breads
  2. Put peanut-butter on them
  3. And finally put those breads together

No matter how dumb your assistant is, hopefully after this instruction he will be able to make a sandwich, but if you ask a computer (assume it is a robot with hands!) to do this job, it will still fail because all of the statements are still ambiguous. Let’s go through them again:

  1. Get whole-grain breads:  How many? 1, 2, 100?
  2. Put peanut-butter on them: Both sides or a single side? How to put the peanut-butter? What to use? How much peanut-butter to use?
  3. And finally put those breads together: How should we put them together?

So let us write another instruction, which is very explicit and clear:

  1. Get 2 whole-grain breads, 1 peanut-butter jar, 1 butter knife, 1 tablespoon
  2. Take one of the breads and put 1 tablespoon of peanut-butter on it.
  3. Then smooth out the peanut-butter with the knife.
  4. Similarly do the other bread.
  5. Finally put the two breads together so that the faces with peanut-butter on them are facing each other!

Now you understand what I mean by painfully unambiguous and clear! Watch the video from CS 50 to see how things can go wrong if you don’t have a precise algorithm.

 

The recipe you just made is called “algorithm” in computer programming. Before you solve any problem you need to solve it yourself and give the computer all the ingredients and the recipe (algorithm) to solve it.

Now you may ask, why do we use a computer if we need to solve the problem ourselves first? Very good question! The answer is that a computer is very, very fast compared to a human being. Once you teach the computer how to solve one class of problems, it can solve any of those problems in the future extremely fast. This great advantage of fast automation is what made computers so popular.

If you want to know a bit more about algorithms you should check this nice animated tutorial video by Dr. David Malan, the famous CS50 course instructor from Harvard University.

 

 

Why python?

harry-1

 

You probably thought about a big snake when you heard the name python. Here we are of course talking about the computer programming language Python. Just as there are many languages in the world, there are many programming languages to talk to computers. So you may ask why I chose python and why you should learn python instead of any other language. In general, people use different languages for different purposes, but python is one of the very few languages that is used for almost everything. So if you learn python you can do pretty much anything!

That is probably the main reason python got so popular in the last few years.

python-perc-1

 

Apart from that here is my laundry-list of reasons for which I love python and you should too:

  1. Python is smarter. In python you don’t need to tell the computer that you are working with a number when you can clearly see that it is a number.
  2. As a result, you need to write a lot less code than most other languages for doing the same work, which is great if you are lazy like me!
  3. As python is very popular people have written a lot of code in it and made many libraries (extra functionalities) that are freely available for your use. So basically no matter what you want to do with python almost always you will find a library. So instead of working hard and coding that functionality, you can work harder on real problem solving and do more with less code!
  4. Finally according to xkcd, python helps you to fly! :D fly-python

Previous Part: Part 0: Introduction

Next Part: Part 2: Preparing for the Magic

All parts: Play with Python

Posted in Education, English, Python