gonionoo – Go wrapper for the Tor Network Status Protocol – OnionOO

I’ve been running a Tor exit node in the Netherlands since August 2013. I believe in the cause of Tor and it was only a matter of time before I started contributing code in some form or another.

gonionoo is a Go wrapper for OnionOO – the Tor Network Status protocol. It is the first step in a slightly larger project that I’ve been planning ever since I became a Tor exit node operator.

The OnionOO API has lots of interesting data on the Tor network. You can see it visualized as part of the Atlas project.

Disco Tip – Crunching web server logs

At my day job we use Disco, a Python + Erlang based Map-Reduce framework, to crunch our web server and application logs to generate useful data.

Each web server produces a couple of GB of logs per day, which adds up to a lot of log data that needs to be processed on a daily basis.

Since the files are big, it was easier for us to perform all the filtering needed to find the rows of interest in the “map” function. The problem is that this requires us to return some generic null value for rows that are not interesting to us. As a result, the intermediate files contain a lot of unnecessary data: the mappings of our uninteresting rows.

To significantly reduce this amount, we have started to use the “combiner” function, so that the intermediate results contain an already summed up result for the file the node is currently processing, composed only of the rows we found interesting using the filtering in the “map” phase.

For example, if we have 1,000 rows and only 200 match a certain filtering criterion for a particular report, instead of getting 1,000 rows in the intermediate file, out of which 800 have the same null value, we now get only 200 rows.

In some cases we saw a speed-up of up to 50% (the result of reducing fewer rows from the intermediate files), not to mention a reduction in disk space use during execution due to the smaller intermediate files.

That way, we can keep the filtering logic in the “map” function while making sure we don’t end up reducing unnecessary data.
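The idea can be sketched in plain Python. This is a toy simulation, not Disco's actual API: the function names (map_line, combine, reduce_counts), the null placeholder and the three-field log format are all assumptions made up for illustration. It only shows where the filtering lives and what ends up in the intermediate results.

```python
from collections import Counter

# Hypothetical placeholder emitted by the map step for rows we do not care about.
NULL_PAIR = (None, 0)


def map_line(line):
    """Map step: the filtering logic stays here.

    Assumed toy log format: "<method> <path> <status>".
    Rows of interest (here: HTTP 500 responses) become (path, 1);
    everything else becomes the generic null pair.
    """
    parts = line.split()
    if len(parts) == 3 and parts[2] == "500":
        return (parts[1], 1)
    return NULL_PAIR


def combine(pairs):
    """Combiner step: drop null pairs and pre-sum the rest for one input file."""
    sums = Counter()
    for key, value in pairs:
        if key is None:
            continue  # uninteresting row, never reaches the intermediate result
        sums[key] += value
    return sorted(sums.items())


def reduce_counts(intermediate_files):
    """Reduce step: merge the already-summed results of all input files."""
    totals = Counter()
    for pairs in intermediate_files:
        for key, value in pairs:
            totals[key] += value
    return dict(totals)


# Two tiny stand-ins for per-server log files.
log_a = ["GET /a 200", "GET /b 500", "GET /b 500", "GET /c 404"]
log_b = ["GET /b 500", "GET /a 500"]

intermediate = [combine(map_line(line) for line in log) for log in (log_a, log_b)]
totals = reduce_counts(intermediate)
```

In this sketch the intermediate result for each file holds only the keys that passed the filter, so the reduce step never sees the null rows.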

Don Dodge, Google and Developers Evangelism

I was just reading over at TechCrunch about Google quickly hiring Don Dodge after he was let go from Microsoft. It seems Don will be doing what he used to do at Microsoft – Developer Evangelism (good for him, and Google!).

I’m very happy to see that Google is putting their stock options and cash where their mouth is to evangelize their APIs, platforms (Android, AppEngine) and tools to developers.

A while back I wrote about Google’s lack of outreach to the Israeli developer community. That gap is still very visible in Israel: judging by the job listings, as well as various events and conventions, Microsoft technology still dominates the Israeli high-tech software scene.

I do hope that hiring Don Dodge, and continuing to release tools, SDKs, platforms and even languages such as the new Go programming language, will create the diversification that every monopolized field needs.

I just hope that Google will start to do more than just very simple and shallow Dev Days and will start reaching out to the community, specifically in Israel. I would like to see a Google I/O event in Israel and maybe a couple of smaller events that dig down into code and details in a more intimate setting with fewer people.

In general I would expect Google to start evangelizing in other countries and to have evangelists in every country where they have an office. I would suggest that Google learn a bit from MSDN as well as the Microsoft Most Valuable Professional (MVP) program. These are some of the best examples of building a community around core leaders who can drive both the community and Google upward.

Google is still light years from matching the well-oiled, well-organized Microsoft evangelism machine and I hope Don and others will be able to make big leaps to close that gap.

Google AppEngine – Python – issubclass() arg 1 must be a class

If you are getting the error “issubclass() arg 1 must be a class” with the Google App Engine SDK for Python on Linux, it’s probably because you are running Python 2.6 (this will probably happen to you on Ubuntu 9.04, where 2.6 is the default).

Just run the dev server under Python 2.5 (i.e. python2.5 dev_appserver.py).