Wednesday, August 25, 2010

Contributing to Open Source - SFD 2010

Today I presented "Contributing to Open Source" as a part of Software Freedom Day @ Alexandria University 2010 held at Bibliotheca Alexandrina. The audience was great and helped in making the presentation better than I anticipated.

This is the presentation; I hope it will be useful to those who missed the event.


It was cool to be presenting at the BA for the first time.

Monday, August 2, 2010

Research Opportunities in Egypt

Lately I have been exploring the research communities and opportunities in Egypt, especially Computer Science-oriented research groups. I was amazed by the number of opportunities that are available now, especially for a fresh graduate like myself.

I'll try to list some of the opportunities I have come across and that I consider tempting.

1. Cairo Microsoft Innovation Center (CMIC) :
One of the best research centers, and it collaborates with many of the institutions that follow. There you'll get the chance to work with cutting-edge technologies, developing projects that will directly help the Arabic-speaking community.

2. Government Funded Research Projects:
Development funds like ITIDA, NTRA and STDF are providing support for research projects and IT projects at all levels. It's quite interesting to work on an independent research project that totally matches your interests, with absolutely no restrictions, and get paid for it.

3. E-JUST & Nile University:
Working on a research project at either of these universities will offer you the chance of working in an environment like no other; it tries to match the graduate-studies experience in the US. Although there are some restrictions that you don't find in an independent research project, you get to finish your master's degree at one of these institutions and get paid a better salary.

4. R&D Startups:
Recently, R&D startups have started to appear in Egypt. They offer a pretty intense experience, with the challenges and excitement that you'll find in any research project while focusing on specific targets, deadlines and specifications. Of course the salary is better than at universities and on independent research projects, but publications are not guaranteed.

5. Mubarak City for Science and Technology:
This one is pretty new to me; I have just learned about the options there. Some of them seem promising, but the environment is awfully similar to that of public universities. Although the salary is a little higher than a typical Teaching Assistant position at any public university, you'll end up getting your master's degree from a public university. There are specific tracks which don't provide a lot of flexibility, but some of them are quite interesting.

6. Bibliotheca Alexandrina:
The BA has A LOT going on, with different teams working on many interesting projects. They have a supercomputer (the only one I know of in Egypt). Also, their book digitization system is recognized as one of the best in the world. The salary is quite good, but it's not easy either to get in or to get out.

7. Masters Degree in Public Universities:
This is the last sanctuary for fresh graduates seeking research opportunities. The problem with this option is that any work requires commitment, and this track doesn't include a payroll, so you end up working at a software house either part time or full time. This greatly degrades the time you have available to do the needed work or even to study the master's courses.

All the above statements are totally subjective and based on my own opinion and the opinions of anonymous people that had experience at each of those institutions.

Thursday, July 22, 2010

A Research Graduation Project Accepted in PACT 2010

I have been working throughout the past year on my graduation project. The main target of the project was accelerating a Numerical Weather Prediction model (the Weather Research and Forecasting model) using GPUs. We implemented the latest version of one of the modules, and our implementation was published as part of the latest release of a system supported by the United States Air Force and the National Oceanic and Atmospheric Administration (NOAA).
After our thesis defense we decided to go on with this work and submit it as a research poster to a high-ranking conference, and we chose PACT. The International Conference on Parallel Architectures and Compilation Techniques (PACT) is a top-tier, rank 1 conference. We submitted our work as a research poster for the ACM Student Research Competition in the Undergraduate Category (I had already participated in an ACM SRC at MobiCom'09 last summer).
I was informed on the 22nd of July 2010 that our work "Quantifying the Impact of GPU Specific Optimizations: An Experimental Study on a Weather Forecasting Application" was accepted, and the judges' comments were great. This is yet another proof that Egyptian public universities, with no funding whatsoever, can produce high-quality research at the undergraduate level.

Waiting for your comments :).

Sunday, May 9, 2010

CUDA compilation on Linux (Makefiles)

I have been struggling with compiling complex CUDA code; by complex I mean a lot of .cu, .F, .cpp and .c files. I have created a makefile that can help you compile that kind of code. I'll try to give a quick view of it here, explaining some parts:

First you'll need to specify the CUDA path and the compilers and linkers. Here I am using gcc and g++ for .c and .cpp files and nvcc for .cu files:

CUDA_INSTALL_PATH ?= /usr/local/cuda

CXX := g++
CC := gcc
LINK := g++ -fPIC
NVCC := nvcc -ccbin /usr/bin

Then you need to specify where to locate the CUDA header files:

# Includes
INCLUDES = -I. -I$(CUDA_INSTALL_PATH)/include

# Common flags
COMMONFLAGS += $(INCLUDES)
NVCCFLAGS += $(COMMONFLAGS)
CXXFLAGS += $(COMMONFLAGS)
CFLAGS += $(COMMONFLAGS)


Then you specify where to locate the CUDA libraries for linking:

LIB_CUDA := -L$(CUDA_INSTALL_PATH)/lib -lcudart


Then you specify the object files that the executable depends on and the name of the executable file:

OBJS = sample_cuda_objectfile.cu.o sample_cpp_objectfile.cpp.o
TARGET = exec
LINKLINE = $(LINK) -o $(TARGET) $(OBJS) $(LIB_CUDA)


Finally you specify the compilation rules. The following are generic rules so all you need to do is change the OBJS and TARGET variables for compilation:

.SUFFIXES: .c .cpp .cu .o


%.c.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

%.cu.o: %.cu
	$(NVCC) $(NVCCFLAGS) -c $< -o $@

%.cpp.o: %.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@

$(TARGET): $(OBJS) Makefile
	$(LINKLINE)

Note that each command line in a rule (the indented lines above) must start with a tab character, not spaces, otherwise make will complain about a missing separator.

Tuesday, May 4, 2010

Contributing to Open Source (Why? and How?)

Open source products are everywhere now; you're probably using some of them right now. And I am sure that a lot of people appreciate the role of open source. This post is about giving back to the community, helping others and moving forward one of the applications/tools you use as a part of your everyday life.

Why Contribute to Open Source??

Sun Microsystems had a slogan "Change (Y)our world". And that summarizes what contributing to open source is all about. It's about making You better and making The World better. Let's make it clearer.

When you contribute to open source it helps you in so many ways: it gets you to code, practice and enhance your programming skills. Then it gets your code reviewed, reviewed and then reviewed once more before it becomes a part of an official release. And finally it gets you to document your code.
Of course how much a project adds to your skills depends on the scale of the project and the importance of the feature you're implementing.

On the other hand, when you contribute to open source you're giving back to the community, helping improve the project you're contributing to and hence making someone's life easier by introducing a feature or fixing a bug, and your code is out there for people to learn from.


How to Contribute to Open Source??

First, you'll have to choose a project.
  • Start small: choose a small project with a small code base that you can get familiar with easily, and whose authors you can reach easily too.
  • Choose a project you use, so you know what it needs and have a clear vision of what it's used for, what type of users you're targeting and what's missing that you once needed.
  • Choose a project you like; that way you'll be able to give it the time and have a reason to commit some time to developing, testing and documenting.
You can start by scanning active projects on SourceForge or GitHub, where you can find active small-scale projects.


Second, start communicating with the development community.
  • Join the project's development mailing lists and start scanning the e-mails to get a sense of the community's current trends and topics of interest.
  • Join the project's IRC channel and start asking the questions you have (search the mailing lists, wiki and blog posts for answers before going to IRC, so you actually get an answer).
  • Check out the code and compile it.
  • Check the project's bug tracker and try fixing a bug or two before working on a feature.

Finally, if you find that you can work on the project, check the features list and START CODING.

Wednesday, March 31, 2010

KNN Algorithm and KD-Trees

I am taking a Pattern Recognition course this semester; it's not really one of my favourite topics but I am doing OK. We had an assignment last week to test and compare different classification algorithms. One of the algorithms we were asked to implement was KNN (K Nearest Neighbours). We were asked to find a way that makes searching for the K Nearest Neighbours faster; that's what this post is about.

The problem, briefly, is this: given two sets of K-dimensional points, find the N nearest points (using Euclidean distance or any other measure) to a K-dimensional point X.

The Naive Solution:
Go through the list, find the distance between X and all the points in the two sets, get the smallest N elements in the distances list... Simple enough.
As part of our assignment we were given a dataset of 100,000 points which proved this algorithm to be very slow.
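
As a rough illustration (this is my own sketch, not the assignment code), here is a minimal Python version of the naive approach; the names train, x and n are hypothetical:

import numpy as np

def knn_naive(train, x, n):
    # Distance between x and every point in the set (Euclidean)
    dists = np.sqrt(((train - x) ** 2).sum(axis=1))
    # Indices of the n smallest distances
    nearest = np.argsort(dists)[:n]
    return train[nearest], dists[nearest]

# Example: 100,000 random 5-dimensional points, 3 nearest neighbours of x
train = np.random.rand(100000, 5)
x = np.random.rand(5)
neighbours, distances = knn_naive(train, x, n=3)

The full scan over the 100,000 points has to be repeated for every query point, which is exactly why it was so slow.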


My Naive WRONG Solution:
I thought that it would be easy to index the points in a way that makes it efficient to find the nearest points to a given point: multiple-dimensional sorting... It was totally wrong, because two points might have the same values in m dimensions while the values of the other (K-m) dimensions put them far away from each other.

Possible Indexing Solutions:
There are different ways to index K Dimensional Points based on how near they are to each other. I will mention here KD-Trees.

KD-Trees:
A k-d tree (k-dimensional tree) is a binary tree based on space partitioning. It's used to index multi-dimensional data.
The idea behind it is using the tree to navigate through space partitions while decreasing the size of each partition as you go through the tree.
The figure represents a simple 3d-tree. The dimensions are X, Y and Z. Constructing such a tree is much like constructing a conventional binary search tree; it differs only in that the splitting mechanism uses a different dimension at each level and keeps cycling through them in the same order.
The root point in the figure splits the space into two partitions, X>5 and X<5. On the second level of the tree the X>5 partition is split into two partitions, Y>3 and Y<3, and so on.

So, as you can see, by navigating through the tree you get a good approximation of the nearest points to the point you're searching for. But because the point you're searching for might be on the edge of one of the partitions, a normal binary search will not always yield the real nearest neighbour, so re-checking is needed.

To do that re-checking, the algorithm saves the point obtained by a normal binary search as the current best. Then it walks back through the nodes already visited in the tree to check two things:
  1. Whether any of those nodes is closer than the current best.
  2. Whether there exists a partition (branch) that might contain closer points. That's done by comparing the current best distance with the distance between the point we're searching for and the splitting plane of the adjacent partition (the sibling of the node currently being examined); if the splitting plane is closer, that branch must be searched too.
I have made a simple implementation of KD-Trees and a nearest neighbour search algorithm in MATLAB. You can find it here.
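
Since the MATLAB code isn't reproduced in this post, here is a rough Python sketch of the same idea (my own illustration, not the original implementation): construction that cycles through the dimensions, and a nearest-neighbour search with the re-checking step described above.

import numpy as np

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    # Split on a different dimension at each level, cycling through them
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                      # the median becomes the splitting point
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    # Returns (point, distance) of the nearest neighbour of `query`
    if node is None:
        return best
    d = np.linalg.norm(np.array(node.point) - np.array(query))
    if best is None or d < best[1]:
        best = (node.point, d)                  # current best
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)           # descend like a binary search
    if abs(diff) < best[1]:                     # splitting plane closer than current best:
        best = nearest(far, query, best)        # re-check the sibling branch
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
print(nearest(tree, (9, 2)))                    # ((8, 1), 1.414...)

The condition abs(diff) < best[1] is the re-checking step: the sibling partition is only explored when the splitting plane is closer to the query than the current best, so most branches are skipped.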

Wednesday, February 17, 2010

Weather Research and Forecasting Model Explained (Part 3) - System Architecture and Parallelism

WRF, as mentioned in part 1 and part 2, is a system designed for both operational and research purposes, and to run on systems ranging from conventional laptops to supercomputers. The design of such a versatile system is a real challenge.
In this post we will explore some of the WRF Software Framework's design and explain the functionality of the different layers of the system. We'll also discuss how the system handles parallelism in different hardware environments.
Software Framework Architecture [1][2]:
(The following functionalities don't include parallelism handling)

Figure 1: Hierarchical WRF Software Design
As it appears in Figure 1, the system design has 3 layers:
The Driver Layer:
It defines and allocates domains and associates them in a nest hierarchy. It also manages the flow of a system run, as it's responsible for stepping the time loop mentioned in part 1. And it's responsible for I/O operations.
The Mediation Layer:
It interfaces between the two other layers. For example the driver layer uses Fortran 90's derived data-types while the model layer deals with scalar arrays only. The mediation layer is responsible for dereferencing the scalar arrays from the derived data types.
The Model Layer:
It performs the actual weather prediction computations like physics calculations and time integration.

Parallelism (Two Level decomposition):

Due to the versatility of the system mentioned before, it has to run on parallel systems including both shared-memory and distributed-memory architectures. Two-level decomposition addresses that problem by decomposing domains into patches and decomposing patches into tiles, with patches being "a section of a model domain allocated to a single memory address space" and tiles being "the amount of a patch that is allocated to a one thread of execution and every cell of a patch that requires computation must fall into some tile"[2].
The driver layer is responsible for distributed- and shared-memory allocation. It also handles the decomposition of domains into patches and of patches into tiles, while the model layer handles only tiles. This reflects one of the best design features of the WRF Software Framework: the model layer is meant to be programmed by meteorological scientists, not computer scientists, so handling parallelism issues (synchronization, memory allocation, communication, etc.) was made part of the driver layer, leaving the model layer with only "meteorological calculators": tile-callable subroutines.
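
To make the patch/tile idea more concrete, here is a toy Python sketch (my own illustration, not WRF code) that splits the rows of a domain into patches, one per memory address space, and each patch into tiles, one per thread:

def split(n_cells, n_parts):
    # Split the range [0, n_cells) into n_parts nearly equal sub-ranges
    bounds = [round(i * n_cells / n_parts) for i in range(n_parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))

def decompose(rows, n_patches, n_tiles):
    patches = []
    for (p_start, p_end) in split(rows, n_patches):              # one patch per process
        tiles = [(p_start + t_start, p_start + t_end)            # one tile per thread
                 for (t_start, t_end) in split(p_end - p_start, n_tiles)]
        patches.append({"patch": (p_start, p_end), "tiles": tiles})
    return patches

# A domain of 100 rows split into 4 patches of 2 tiles each
for patch in decompose(100, n_patches=4, n_tiles=2):
    print(patch)

Every cell of the domain ends up in exactly one tile, which is the property quoted above; the driver layer does this bookkeeping so the model layer only ever sees a single tile at a time.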

[1] John G. Michalakes, Michael McAtee and Jerry Wegiel, Software Infrastructure for the Weather Research and Forecast Model
[2] Weather Research and Forecast Model 1.2: Software Design and Implementation

Friday, February 12, 2010

Weather Research and Forecasting Model Explained (Part 2) - Nesting and System Components

In part 1 we introduced WRF and discussed the meteorological basics of the Advanced Research WRF. In this part we discuss the concept of nesting and explain the different parts of the WRF system. But first we'll define two basic terms.
Grid: Raw atmospheric data laid out as a discrete grid for an area of the map.
Figure 1 [1] Grid examples
Domain: A model created by processing grid information and making it ready for processing by the WRF Model.
Nesting:
Nesting is a way of getting high-resolution data from low-resolution data. That's achieved by introducing one or more additional grids to the initial state. Those finer-grain grids force the need for more high-resolution data; that data is obtained by interpolating data from the coarse-grain grids, or it could be user input.
Figure 2 [1] Nested grid
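
As a rough illustration of how a nested (finer) grid can get its data from a coarse grid, here is a small Python sketch of bilinear interpolation onto a grid that is ratio times finer (my own toy example; it is not how WPS/WRF actually implements nesting):

import numpy as np

def refine(coarse, ratio):
    # Bilinearly interpolate a 2-D coarse field onto a grid `ratio` times finer
    ny, nx = coarse.shape
    # Positions of the fine-grid points expressed in coarse-grid index space
    y = np.linspace(0, ny - 1, (ny - 1) * ratio + 1)
    x = np.linspace(0, nx - 1, (nx - 1) * ratio + 1)
    j0, i0 = np.floor(y).astype(int), np.floor(x).astype(int)
    j1, i1 = np.minimum(j0 + 1, ny - 1), np.minimum(i0 + 1, nx - 1)
    wy, wx = (y - j0)[:, None], (x - i0)[None, :]
    return ((1 - wy) * (1 - wx) * coarse[np.ix_(j0, i0)] +
            (1 - wy) * wx * coarse[np.ix_(j0, i1)] +
            wy * (1 - wx) * coarse[np.ix_(j1, i0)] +
            wy * wx * coarse[np.ix_(j1, i1)])

coarse_temp = np.array([[280.0, 282.0],
                        [284.0, 286.0]])       # a 2x2 "temperature" grid
print(refine(coarse_temp, ratio=3))            # a 4x4 nested grid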
WRF System Components:

Figure 3 [1] WRF system components
As it appears in Figure 3, the system has 3 main components: the Preprocessor (WPS), the WRF Model and the Post Processor.
The WPS is responsible for preparing the input (initial conditions and lateral boundary conditions) to ARW for real-data simulations by [2]:
  1. Defining the simulation domain and nested domains.
  2. Computing latitude, longitude and map scale factors for every grid point.
  3. Interpolating time-invariant terrestrial data (e.g., terrain height and soil type) onto the simulation grids.
  4. Interpolating meteorological fields from another model onto the simulation domains.
The WRF Software Framework is the core engine of the system; it contains the dynamics solver, the physics packages, WRF-Chem and the interfacing between them. That core implements all the equations mentioned in part 1.
The Post Processor processes the output of the WRF Software Framework and converts it to GRIB format. For more information check [3] and [4].

[1] A Description of the Advanced Research WRF Version 3, http://www.mmm.ucar.edu/wrf/users/docs/arw_v3.pdf
[2] WPS Basics, http://www.mmm.ucar.edu/people/duda/files/wps_files/WPS-basics.pdf
[3] NCEP WRF Post Processor User Guide, http://www.emc.ncep.noaa.gov/mmb/papers/chuang/2/wrfpost.txt
[4] User's Guide for the NMM Core of the Weather Research and Forecast (WRF) Modeling System Version 3, Chapter 7: Post Processing Utilities.

Thursday, February 4, 2010

Weather Research and Forecasting Model Explained (Part 1) - Weather Prediction Basics

My goal here is to give a simple overview of the WRF for computer science students.
The Weather Research and Forecasting (WRF, pronounced "worf") model is an open-source numerical weather prediction (NWP) and atmospheric simulation system designed for both research and operational applications [1]. This system was developed as a joint effort between the National Center for Atmospheric Research, the National Oceanic and Atmospheric Administration (NOAA), the Air Force Weather Agency and others. In this post, and the posts to follow, I'll try to explain the WRF from two different perspectives, the first being meteorology and the second being computer science.
I am here concerned with the Advanced Research WRF (ARW) subset of WRF.
Weather Prediction Basics:
Weather prediction means predicting the state of the atmosphere in terms of pressure, temperature, wind speed, etc. Our atmosphere is governed by the flow and conditions of the air across the planet, and by predicting that flow we can predict the weather conditions in the area of interest. The prediction is done by providing a system with information about the current state of the area of interest. The system then uses that information as the parameters for its equations (usually difference equations) to produce future conditions for that area.
Euler's Equations:
Advanced Research WRF's dynamics engine integrates Euler's Equations to predict air flow.
Euler's equations are defined as a set of differential equations used in fluid dynamics to govern inviscid flow (Wikipedia).
Being continuous equations, Euler's equations have to be discretized; both temporal and spatial discretization are needed. So the surface of the earth is mapped as a grid, based on the projection method used to project the spherical shape of the earth, and all variables are defined on that grid.
Figure 1: Spatial discretization
(V,U,W are wind velocities, θ is the potential temperature)
For temporal discretization, Euler's equations are integrated using a time-split Runge-Kutta (RK3) integration scheme.

The following loop is used to integrate the equations, and it is the main loop in the system (i.e. all other components exist to provide this loop with the information needed to compute the current time step):

Begin Time Step
  Begin RK3 Loop:
    Solve the equations
  End RK3 Loop
  Compute non-RK3 physics
End Time Step
Note: This is a very simplified version of the loop; for the complete version see [1].
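
To make the temporal discretization a bit more concrete, here is a minimal Python sketch of the kind of third-order Runge-Kutta step used by ARW, as I understand it from [1]: three successive evaluations of the tendency, each restarting from the state at the beginning of the step. It integrates a toy scalar ODE, whereas the real solver advances the full set of discretized fields.

def rk3_step(phi, tendency, dt):
    # Three successive evaluations of the tendency R(phi)
    phi_1 = phi + dt / 3.0 * tendency(phi)      # first sub-step
    phi_2 = phi + dt / 2.0 * tendency(phi_1)    # second sub-step
    return phi + dt * tendency(phi_2)           # full step

# Toy example: d(phi)/dt = -phi, integrated from t=0 to t=1
phi, dt = 1.0, 0.1
for _ in range(10):
    phi = rk3_step(phi, lambda p: -p, dt)
print(phi)                                      # close to exp(-1) = 0.3679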

Initial and lateral boundary conditions:
For the system to be able to predict the weather for a certain area, it doesn't only need the area's initial conditions; it also needs to know its boundary conditions, since adjacent areas of course affect the area of interest. To solve this problem, several methods of generating lateral boundary data are used, including periodic boundaries in which the lateral boundary conditions are repeated along the x-axis, the y-axis or both.

[1] A Description of the Advanced Research WRF Version 3