Varnish: workaround for high, ever-increasing CPU usage

If you are using Varnish version 2.1 or later and see CPU usage climb steadily, to the point where you need to restart the service to bring it back down, you may want to add the “-h classic” argument to the varnishd command line.

This reverts to the older “classic” hashing method instead of the newer “critbit”, which was first introduced in version 2.1.

You can read a little bit more about it on the Yedda Dev Blog.

Disco Tip – Crunching web server logs

At my day job we use Disco, a Python + Erlang based Map-Reduce framework, to crunch our web server and application logs and generate useful data.

Each web server produces a couple of GB of log data per day, which adds up to a lot of log data that needs to be processed on a daily basis.

Since the files are big, it was easier for us to perform all the filtering needed to find the rows of interest in the “map” function. The problem is that this requires us to return some generic null value for rows that are not interesting to us. As a result, the intermediate files contain a lot of unnecessary data: the mapped output of our uninteresting rows.

To significantly reduce this data, we have started to use the “combiner” function, so that our intermediate results contain an already summed-up result for the file the node is currently processing, composed only of the rows that passed the filtering in the “map” phase.

For example, if we have 1,000 rows and only 200 match the filtering criteria for a particular report, instead of getting 1,000 rows in the intermediate file, 800 of which carry the same null value, we now get at most 200 rows.

In some cases we saw run time improve by up to 50% (the speedup comes from the reduce phase having fewer rows to read from the intermediate files), not to mention lower disk usage during execution thanks to the smaller intermediate files.

That way, we can keep the filtering logic in the “map” function while making sure we don’t end up reducing unnecessary data.
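To make the idea concrete, here is a minimal sketch in plain Python. It does not use the Disco API at all; map_row, combine, run_node, the 404 filter and the sample rows are all hypothetical, made up only to show how a combiner step shrinks the intermediate data while the filtering stays in the map step.

```python
NULL_KEY = None  # hypothetical marker the map step returns for uninteresting rows


def map_row(row):
    """Map step: all filtering logic lives here.

    Returns (url, 1) for a row of interest, or a generic null pair otherwise.
    The "url status" row layout and the 404 filter are assumptions.
    """
    fields = row.split()
    if len(fields) == 2 and fields[1] == "404":
        return fields[0], 1
    return NULL_KEY, 0


def combine(pairs):
    """Combiner step: sum values per key on the node and drop the null pairs."""
    totals = {}
    for key, value in pairs:
        if key is NULL_KEY:
            continue  # uninteresting row, never reaches the intermediate results
        totals[key] = totals.get(key, 0) + value
    return totals


def run_node(rows, use_combiner=True):
    """Simulate the intermediate results a single node would write out."""
    mapped = [map_row(row) for row in rows]
    if use_combiner:
        return sorted(combine(mapped).items())
    return mapped


rows = ["/a 404", "/b 200", "/a 404", "/c 200", "/d 404"]
without = run_node(rows, use_combiner=False)
with_combiner = run_node(rows)
print(len(without), len(with_combiner))  # prints "5 2"
```

Without the combiner every input row produces an intermediate entry, null or not. With it, only the already summed-up interesting keys are left for the reduce phase.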

Ubuntu 9.10 Karmic Koala and ies4linux – Installation

Installing ies4linux on Ubuntu 9.10 Karmic Koala by just running “./ies4linux” might show some warnings such as:

IEs4Linux 2 is developed to be used with recent Wine versions (0.9.x). It seems that you are using an old version. It’s recommended that you update your wine to the latest version (Go to: winehq.com).

In my case it showed the above text, which seems to be just a warning, and launched the UI, which then got stuck and didn’t complete anything.

To overcome this issue, simply run the installation in a terminal window without the GTK based UI:

./ies4linux --no-gui

That’s it. Works like a charm.

Message in a bottle

Launching a startup is like sending a message in a bottle. If the message is not clear, no one will come to visit your lonely island or send you a postcard back.

When you launch your startup, your online presence (website, Twitter account, Facebook page, etc.) and the buzz you manage to create in the official and unofficial online press are the message you are passing to your users. If the message is not clear, you can lose a lot of attention.

Before launching your startup you might want to test your messaging. I propose two very simple tests that can serve as rather good markers to determine if your message is clear.

Test rules:

  • Each of the tests should be given to 2 different people
  • These people should have no prior knowledge of your startup and what it does
  • One person should be a Non-Techie – someone not from the tech industry who is known to have little to no technical background. The other should be a Techie – someone from the tech industry that can eventually ask a question along the lines of “How are you going to implement this?” and understand the answer.
  • Each test should have a different set of people; you cannot reuse people from one test in the other test.

Test #1 – One sentence or less (or the 140 character pitch)

Tell each of the 2 people in one sentence or less what your startup does. If they don’t ask for additional clarification, you can consider your message rather clear. If they don’t ask for additional clarification and are also intrigued by your startup and its message, you can safely assume that your message is clear enough and that your startup does interest them.

Test #2 – The blind website test

Show your website to the 2 people without saying a word. Ask them to read what is written and explain to you what they think your startup is about. If they can explain it and completely understand what you are doing, you can be reasonably certain that your website’s message is clear even to new users who have no prior knowledge of what your startup does.

If you did not pass one of the tests, try it again on a different set of people (just to make sure these 4 are not a statistical anomaly). If the result is still the same, revise your messaging and, as always, remember to rinse and repeat.

Don Dodge, Google and Developer Evangelism

I was just reading over at TechCrunch about Google quickly hiring Don Dodge after he was let go from Microsoft. It seems Don will be doing what he used to do at Microsoft – Developer Evangelism (good for him, and Google!).

I’m very happy to see that Google is putting their stock options and cash where their mouth is to evangelize their APIs, platforms (Android, AppEngine) and tools to developers.

A while back I wrote about Google’s lack of outreach to the Israeli developer community. It is still very visible in Israel, in the job listings as well as in various events and conventions, that Microsoft technology dominates the Israeli high-tech software scene.

I do hope that hiring Don Dodge and continuing to release tools, SDKs, platforms and even languages, such as the new Go programming language, will create the necessary diversification that every monopolized field needs.

I just hope that Google will start to do more than very simple and shallow Dev Days and will start reaching out to the community, specifically in Israel. I would like to see a Google I/O event in Israel and maybe a couple of smaller events that dig down into code and details in a more intimate setting with fewer people.

In general I would expect Google to start evangelizing in other countries and to have evangelists in every country where they have an office. I would suggest that Google learn a bit from MSDN as well as the Microsoft Most Valuable Professional (MVP) program. These are among the best examples of building a community around core leaders who can drive the community, as well as Google, straight up.

Google is still light years away from the well oiled, well organized Microsoft evangelism machine, and I hope Don and others will be able to make big leaps to close that gap.

New programming languages force you to re-think a problem in a fresh way (or why do we need new programming languages. always.)

Whenever a new programming language appears some claim it’s the best thing since sliced bread (tm – not mine ;-) ), others claim it’s the worst thing that can happen and that you can implement everything the language provides in programming language X (assign X to your favorite low level programming language and append a suitable library).

After seeing Google’s new Go programming language I must say I’m excited. Not because it’s from Google and got a huge buzz around the net. I am excited about the fact that people decided to think differently before they went on and created Go.

I’m reading Masterminds of Programming: Conversations with the Creators of Major Programming Languages (a good read for any programming language fanatic), which is a set of interviews with the creators of various programming languages. It’s very interesting to see the thoughts and processes behind a couple of the most widely used programming languages (and even some not-so-widely-used ones).

In a recent interview Brad Fitzpatrick (of LiveJournal fame and now a Google employee) was asked:

You’ve done a lot of work in Perl, which is a pretty high-level language. How low do you think programmers need to go – do programmers still need to know assembly and how chips work?

To which he replied:

… I see people that are really smart – I would say they’re good programmers – but say they only know Java. The way they think about solving things is always within the space they know. They don’t think ends-to-ends as much. I think it’s really important to know the whole stack even if you don’t operate within the whole stack.

I subscribe to Brad’s point of view because a) you need to know your stack from end to end, from the metal in your servers (i.e. server configuration) and the operating system internals to the data structures used in your code, and b) you need to know more than one programming language to open up your mind to different ways of implementing a solution to a problem.

Perl has regular expressions baked into the language, which makes every Perl developer think in terms of pattern matching when performing string operations instead of writing tedious code for finding and replacing strings. Of course you can always use various find and replace methods, but having compiled pattern matching built in, along with the way of thinking that comes with it, makes it much more accessible, powerful and useful.
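The pattern matching way of thinking carries over to other languages too. Here is a minimal sketch in Python rather than Perl, using the standard re module; the date pattern, the reformat_dates helper and the sample sentence are made up for illustration.

```python
import re

# Compiled once and reused for every string we process.
# Hypothetical pattern: ISO-style dates such as 2009-10-29.
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def reformat_dates(text):
    """Rewrite every YYYY-MM-DD date in the text as DD/MM/YYYY in a single pass."""
    return DATE.sub(r"\3/\2/\1", text)


result = reformat_dates("Released 2009-10-29, patched 2009-11-12.")
print(result)  # prints "Released 29/10/2009, patched 12/11/2009."
```

Doing the same with plain find and replace calls would mean locating each date by hand, slicing out its parts and stitching the string back together.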

Python has lists and dictionaries (using a VERY efficient hashtable implementation, at least in CPython) baked into the language, because lists and dictionaries are very powerful data structures that can be used in a lot of solutions to problems.

One of Go’s baked-in features is concurrency support in the form of goroutines. Goroutines make the use of multi-core systems very easy, without many of the complexities that exist in multi-processing or multi-threaded programming, such as synchronization. This feature actually shares some ancestry with Erlang (which by itself has a very unique syntax and vocabulary for scalable functional programming).

Every programming language brings something new to the table: a new way of looking at things and solving problems. That’s why it’s so special :-)

Google AppEngine – Python – issubclass() arg 1 must be a class

If you are getting the error “issubclass() arg 1 must be a class” with the Google App Engine SDK for Python on Linux, it’s probably because you are running Python 2.6 (this will probably happen to you on Ubuntu 9.04, where 2.6 is the default).

Just run the dev server under Python 2.5 (i.e. python2.5 dev_appserver.py).

Error: “Operation could not be completed (error 0x000006d1)” when adding a Samba based network printer to Vista

If you are getting the following error while adding a Samba based network printer to Vista:

Windows cannot connect to the printer. Operation could not be completed (error 0x000006d1).

and you have a Samba server (version 3.0 or above), consider using the following technique to add the printer:

  • Add a local printer (not a network one!)
  • Select “create a new port”
  • Select “Local port” as type of port
  • In the port name enter the printer’s SMB path, i.e. \\sambaserver\printer_name
  • Select the right driver

That’s all. Works like a charm!

If you have an older version of Samba (< 3.0), note that Vista uses NTLMv2 by default. Follow these instructions to revert to NTLMv1 (this also applies to regular shares).

Also note that since this is a local printer that prints to a print queue on the Samba server, you might not be able to delete print jobs that have already been fully sent to the Samba server’s print queue, since what we created is essentially a local queue.

“Unable to retrieve MSN Address Book” on Pidgin on Ubuntu / Debian?

Today I got the following error on Pidgin (I’m running version 2.5.2 on Ubuntu 8.10 Intrepid Ibex) while it tried to connect to MSN:

“Unable to retrieve MSN Address Book”

After searching a bit I found this post by Gijs Nelissen which said to use a different MSN plugin for Pidgin called msn-pecan.

I’ll reiterate the instructions for those with Ubuntu / Debian:

  1. Close Pidgin (make sure the process is really down)
  2. Run “apt-get install msn-pecan”
  3. Start pidgin
  4. Change your MSN account type from MSN to WLM
  5. Reconnect

I don’t know if this error affects other libpurple based multi-protocol IM clients such as Adium (UPDATE: it appears this IS a libpurple issue, so Adium IS affected). In any case, the msn-pecan project has a Windows binary release as well as a source release (if you care/need/want to compile it for Mac OS X or other Linux distributions).

Ubuntu 8.10, Dell D630, fan issues and screen repaints issues

On the day Ubuntu 8.10 was released I upgraded my work laptop (Dell D630) to it. I had previously run the release candidates on my home desktop and saw that all was well, so I didn’t expect any specific issues with the upgrade.

After finishing the upgrade successfully I encountered 2 problems.

The first was with the computer fan. It was switching on and off at full steam in 4-second cycles. Really annoying. A quick search in the Ubuntu forums led to this post saying I should upgrade to the latest BIOS version (A13, at least at the time of writing this post).

Upgrading to the latest BIOS stopped the fan from cycling between full speed and full stop, but it was still running a bit too much even when the computer was rather idle.

There was another post in the forums that suggested going back to the older Nvidia drivers (version 173) instead of using the version which ships with Ubuntu 8.10 (177).

That managed to solve the fan issues for now as well as fix some strange repaint problems I was seeing when working with TwinView and extending my screen to another external monitor.

Thought it might help others who face these problems.