Sunday, August 28, 2011 0 comments

Open Science - SFD 2011

For the third year I was invited to give a talk at Software Freedom Day @ Alexandria University 2011 and I wanted to talk about something that links what I am doing now (I am a research assistant at E-JUST) and open source so I started researching Open Source Science.

Starting from the fact that I use a lot of open source modules and packages in my work developed by others and put on the internet for everyone to use. I began to understand the wider concept of open science which aims at enabling everyone who is willing to participate in any scientific endeavor to have a part in it.

Facilitated by the internet, ideas, tasks, documents, code, scientific data and even videos of experiments are shared by the project's coordinator for anyone to participate in anyway and every way possible. And the results are amazing !!

My main goal from the presentation and this post is to show how openness can boost research either by using the collective intelligence of all participants (as in the PolyMath project or InnoCentive) or reducing the time or the resources need as in science for citizens organization.

Other aspects of open science include making all the products of scientific projects available to the public for everyone to learn from and build on, saving money and time for other projects.

One of the best introduction you can have to open science is the presentation made by Michael Nielsen at TEDxWaterloo.



Also below is the presentation I gave at Software Freedom Day @ Alexandria University 2011.



Wednesday, August 25, 2010 1 comments

Contributing to Open Source - SFD 2010

Today I presented "Contributing to Open Source" as a part of Software Freedom Day @ Alexandria University 2010 held at Bibliotheca Alexandrina. The audience was great and helped in making the presentation better than I anticipated.

This is the presentation I hope it would be useful to those that missed the event.


It was cool to be presenting at the BA for the first time.
Monday, August 2, 2010 2 comments

Research Opportunities in Egypt

Lately I have been exploring the research communities and opportunities in Egypt, specially Computer Science oriented research groups. I was amazed by the number of opportunities that are available now specially for a fresh graduate like myself.

I'll try to list some of the opportunities I have came across and that I consider tempting.

1. Cairo Microsoft Innovation Center (CMIC) :
One of the best research centers and it collaborates with a lot of the institutions that follows. There you'll get the chance to work with cutting edge technologies developing projects that will directly help the Arabic community.

2. Government Funded Research Projects:
Development funds like ITIDA, NTRA and STDF are providing support for research projects and IT projects on all level. It's quite interesting to be working on an independent research project that totally matches your interests with absolutely no restrictions and get payed.

3. E-JUST & Nile University:
Working on a research project at either of these universities will offer you the chance of working in an environment like no other. It tries to match the graduate studies experience in the US. Although there are some restrictions that you don't find in independent research project, you get to finish your master's degree at one of these institutions and get payed a better salary.

4. R&D Startups:
Recently R&D startups started to appear in Egypt. They offer a pretty intense experience offering the challenges and excitement  that you'll find at any research project while focusing on specific targets, deadlines and specifications. Of course the salary is better than universities and independent research projects, but publications are not guaranteed.

5. Mubarak City for Science and Technology:
This one is pretty new to me. I have just learned about the options there. Some of them seems promising but the environment is awfully similar to those of public universities. Although the salary is a little bit higher than a typical Teaching Assistant position at any public university you'll end up getting you master's degree from a public university. There are specific tracks which doesn't provide a lot of flexibility but some of them are quite interesting.

6. Bibliotheca Alexandrina:
The BA has A LOT going on with different teams working on many interesting projects. They have a super computer (the only I know of in Egypt). Also their books digitization system is recognized as one of the best systems in the world. The salary is quite good but it's not easy either to get in or get out.

7. Masters Degree in Public Universities:
This is the last sanctuary to fresh graduates seeking research opportunities. The problem with this option is that any work requires commitment and this track doesn't include a payroll so you end up working at software house either part or full time. This degrades your efficiency greatly regarding time available to do the needed work or even studying the master's courses.

All the above statements are totally subjective and based on my own opinion and the opinions of anonymous people that had experience at each of those institutions.
Thursday, July 22, 2010 10 comments

A Research Graduation Project Accepted in PACT 2010

I have been working throughout the past year on my graduation project. The main target of the project was accelerating a Numerical Weather Prediction model (the Weather Research and Forecasting model) using GPUs. We implemented the latest version of one of the modules and our implementation was published as part of the latest release of a system supported by the United States Air Force and the National Oceanic and Atmospheric Agency (NOAA).
After our thesis defense we decided to go on with this work and submit it as a research poster to a high ranking conference and we chose PACT. International Conference on Parallel Architectures and Compilation Techniques (PACT) is a top tier, rank 1 conference. We submitted our work as a research poster for the ACM Student Research Competition in the Undergraduate Category (I have already particpated at an ACM SRC at MobiCom'09 last summer).
I was informed on 22nd of July 2010 that our work "Quantifying the Impact of GPU Specific Optimizations: An Experimental Study on a Weather Forecasting Application" was accepted and the judges' comments were great. This is yet another proof that Egyptian Public Universities, with no fund whatsoever, can produce high quality research on the undergraduate level.

Waiting for your comments :).
Sunday, May 9, 2010 0 comments

CUDA compilation on Linux (Makefiles)

I have been struggling with compiling complex CUDA code, with complex I mean a lot of .cu .F .cpp and .c files. I have created a makefile that can help you compile that kind of code. I'll try to give a quick view of it here explaining some parts:

First you'll need to specify the CUDA path and the compilers and linkers. Here I am using gcc and g++ for .c and .cpp files and nvcc for .cu files:

CUDA_INSTALL_PATH ?= /usr/local/cuda

CXX := g++
CC := gcc
LINK := g++ -fPIC
NVCC := nvcc -ccbin /usr/bin

Then you need to specify where to locate the CUDA header files :

# Includes
INCLUDES = -I. -I$(CUDA_INSTALL_PATH)/include

# Common flags
COMMONFLAGS += $(INCLUDES)
NVCCFLAGS += $(COMMONFLAGS)
CXXFLAGS += $(COMMONFLAGS)
CFLAGS += $(COMMONFLAGS)


Then you specify where to locate the CUDA binaries for linking :

LIB_CUDA := -L$(CUDA_INSTALL_PATH)/lib -lcudart


Then you specify the object files that the executable depends on and the name of the executable flle:

OBJS = sample_cuda_objectfile.cu.o sample_cpp_objectfile.cpp.o
TARGET = exec
LINKLINE = $(LINK) -o $(TARGET) $(OBJS) $(LIB_CUDA)


Finally you specify the compilation rules. The following are generic rules so all you need to do is change the OBJS and TARGET variables for compilation:

.SUFFIXES: .c .cpp .cu .o


%.c.o: %.c
$(CC) $(CFLAGS) -c $< -o $@

%.cu.o: %.cu
$(NVCC) $(NVCCFLAGS) -c $< -o $@

%.cpp.o: %.cpp
$(CXX) $(CXXFLAGS) -c $< -o $@

$(TARGET): $(OBJS) Makefile
$(LINKLINE)
Tuesday, May 4, 2010 2 comments

Contributing to Open Source (Why? and How?)

Open Source products are everywhere now, probably now you're using some of them. And I am sure that a lot of people are appreciating the role of open source. This post is about giving back to the community, helping others and moving forward one of the applications/tools you're using as a part of your everyday life.

Why Contribute to Open Source??

Sun Microsystems had a slogan "Change (Y)our world". And that summarizes what contributing to open source is all about. It's about making You better and making The World better. Let's make it clearer.

When you contribute to open source it helps you in some many ways, it gets you to code and practice and enhance your programming skills. Then gets your code reviewed, reviewed and then reviewed once more before it becomes a part of an official release. And finally it gets you to document your code.
Of course how much a project adds to your skills depends on the scale of the project and the importance of the feature you're implementing.

On the other hand, when you contribute to open source you're giving back to the community, helping improve the project you're contributing to and hence making someone's life easier by introducing a feature or fixing a bug and your code is out there for people to learn for it.


How to Contribute to Open Source??

First, you'll have to choose a project.
  • Start small, choose small project with a small code base that you can get familiar with easily and you can get the people who wrote the code easily too.
  • Choose a project you use, so you can be able to know what it needs and have clear vision of what's used for, what type of users you're targeting and what's missing that you once needed.
  • Choose a project you like, and that way you'll be able to give it the time and get to have a reason to commit some time for developing, testing and documenting.
You can start by scanning active projects on sourceforge or github where you can find active small scale projects.


Second, start communicating with the development community.
  • Join the project's development mailing lists and start scanning the e-mails to sense the trends and topics of interest to the community currently.
  • Join the project's IRC channel and start asking the questions you have (search for answers on the mailing lists, wiki and blog posts before going to IRC so you get an answer).
  • Chekout the code and compile it.
  • Check the project's bug tracker and try fixing a bug or two before working on a feature.

Finally if you find that you can work on the project CHECK FEATURES LIST START CODING.
Wednesday, March 31, 2010 1 comments

KNN Algorithm and KD-Trees

I am taking a Pattern Recognition course this semester, it's not really one of my favourite topics but I am doing OK. We had an assignment last week to test and compare different classification algorithms. One of the algorithms we were asked to implement was KNN (K Nearest Neighbours). We were asked to find a way that makes searching for the K Nearest Neighbours faster; that's what this post about.

The problem briefly is that: Given two sets of K dimensional points find the N nearest points (using Euclidean distance or any other measurement) to a K dimensional point X.

The Naive Solution:
Go through the list, find the distance between X and all the points in the two sets, get the smallest N elements in the distances list... Simple enough.
As part of our assignment we were given a dataset of 100,000 points which proved this algorithm to be very slow.


My Naive WRONG Solution:
I thought that it was easy to index the points in a way that makes it efficient to find the nearest points to a given point. Multiple Dimensional Sorting... It was totally wrong because two points might have the same values of m dimensions but the values of the other (K-m) dimensions put them far away from each other.

Possible Indexing Solutions:
There are different ways to index K Dimensional Points based on how near they are to each other. I will mention here KD-Trees.

KD-Trees:
K dimensional trees is a binary tree that is based on space partitioning. It's used to index multi-dimensional data.
The idea behind it is using the tree to navigate through space partitions while decreasing the size of each partition as you go through the tree.
The figure represents a simple 3d-tree. The dimensions are X,Y and Z. Constructing such a tree is much similar to a conventional binary search tree but differs in only that splitting mechanism uses a different dimension at each level and keeps iterating on them in the same order.
The root point in the figure splits the space into two partitions X>5  and X<5. On the second level of the tree the X>5 partition is split into two partitions Y>3 and Y<3 and so on.

So as you can see by navigating through the tree you get a good approximation of the nearest points to the point you're searching for. But because the point you're searching for might be on the edge of one of the partitions a normal binary search will not always yield the correct the real nearest neighbour so re-checking is needed.

To do that re-checking the algorithm saves the point obtained by a normal binary search as the current best. Then it iterates on the nodes already visisted in the tree to check for two things:
  1. If any of those nodes' distance is smaller than the current best.
  2. If there exists a partition (branch) that might contain points with smaller distances. That's done by comparing the distance between the point we're searching for and the splitting point on the adjacent partition (the sibling of the node currently being examined) and the current best distance.
I have made a simple implementation of KD-Trees and a nearest neighbour search algorithm in matlab. You can find it here.