5 tips on future-proofing your Medium posts

So, you’ve decided you want to blog or write on Medium – where all the cool kids hang out. Great. Remember there are other similar platforms to write and blog on, and at some point Medium (like everything else on the Internet) might lose its appeal or even, god forbid, shut down.

When that happens, what will happen to your posts? How will you and the rest of the internet reach them?

There have been numerous occasions in the past (the most recent being Posterous) where a platform simply died and all the links pointing to it, along with all their SEO goodness, went to shaite.

You can always run your own server, but not everyone has the time, power or know-how to do so.

Here are a few tips to help you stay forward compatible with most platforms in the future:

  1. Choose a platform that supports a custom domain. If you can’t blog under your own domain, you will never truly own your content.
  2. Make sure to blog under your own domain (or a subdomain such as blog.mycooldomain.com). If you don’t own a domain – get one. It’s very cheap, costing anywhere between ~$3-$10/year depending on the TLD (.com, .net, .co, etc.) and the domain registrar (I like namecheap.com). Medium added support for custom domains in March 2015, so you have no excuse.
  3. Make sure you have some sort of backup for your posts. I usually like to write my posts without formatting in Google Docs or Simplenote, so I get an immediate backup. Before publishing, I copy the content to the publishing platform I use. That way, even if your platform goes down, you can always restore your posts using your domain and your backup of the posts. Another benefit is that you don’t need to rely on an export feature that your dying platform may or may not provide. If you are slightly more technically advanced, I suggest writing your posts in Markdown. There are various tools to generate better-looking HTML that you can later paste into your current cool blogging site (see the first sketch after this list).
  4. Make backups of attached/uploaded resources. If your posts contain images or other resources that you have uploaded to your chosen platform, make sure to have copies of these files so that you can always restore them along with the post text on another platform if needed (see the second sketch after this list).
  5. Save all of your posts’ URLs. Make a document or spreadsheet of all your posts’ URLs. For example, if a blog post URL is http://mycooldomain.com/2015/12/31/something-cool, make sure to copy and save it. If your chosen blogging platform goes down, you can always add a redirect rule from the old URL (as it appeared on the old platform) to how it appears on the new one, thus not breaking da internetz! (See the third sketch after this list.)
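For the Markdown route in tip 3, here is a minimal sketch using Python’s markdown package – one of many tools that do this; the file names are just examples:

```python
# pip install markdown
import markdown

# Read a post written in Markdown (your local backup copy).
with open("my-post.md", encoding="utf-8") as f:
    text = f.read()

# Convert it to HTML that you can paste into your blogging platform.
html = markdown.markdown(text)

with open("my-post.html", "w", encoding="utf-8") as f:
    f.write(html)
```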
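For tip 4, a rough sketch of grabbing copies of a post’s images with requests and BeautifulSoup; the post URL is the example one from tip 5, and it assumes the images sit in plain img tags:

```python
# pip install requests beautifulsoup4
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

post_url = "http://mycooldomain.com/2015/12/31/something-cool"
backup_dir = "post-assets"
os.makedirs(backup_dir, exist_ok=True)

# Fetch the published post and save a copy of every image it references.
soup = BeautifulSoup(requests.get(post_url).text, "html.parser")
for img in soup.find_all("img", src=True):
    src = urljoin(post_url, img["src"])
    filename = os.path.join(backup_dir, os.path.basename(src.split("?")[0]))
    with open(filename, "wb") as f:
        f.write(requests.get(src).content)
```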
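And for tip 5, a sketch that turns your saved list of URLs into redirect rules. It emits nginx rewrite lines, but the same idea works for any platform’s redirect mechanism; the CSV file and its two columns are assumptions:

```python
import csv

# urls.csv maps each old path to its new home, one post per line:
# /2015/12/31/something-cool,https://mycooldomain.com/posts/something-cool
with open("urls.csv", newline="") as f:
    for old_path, new_url in csv.reader(f):
        # One permanent (301) redirect rule per post.
        print(f"rewrite ^{old_path}$ {new_url} permanent;")
```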

These rules don’t apply just to Medium; they apply to most platforms, such as WordPress (wordpress.com or a self-hosted one), any static site generator, Ghost, Svbtle, Squarespace, Weebly, etc.

In my opinion, the best choice nowadays for people who don’t want to mess around with servers is to use a static site generator such as Jekyll. While it involves running a few commands in your shell (that black screen with running white text), you can easily build a site, generate it and host it on platforms such as Surge or Netlify.

Disco Tip – Crunching web server logs

At my day job we use Disco, a Python + Erlang based Map-Reduce framework, to crunch our web server and application logs and generate useful data.

Each web server produces a log file of a couple of GB per day, which adds up to a lot of log data that needs to be processed daily.

Since the files are big, it was easier for us to perform all the necessary filtering to find the rows of interest in the “map” function. The problem is that this requires us to return some generic null value for the rows that are not interesting to us, which causes the intermediate files to contain a lot of unnecessary data mapped from those uninteresting rows.

To significantly reduce this number, we started using the “combiner” function, so that each node’s intermediate results contain an already summed-up result for the file it is currently processing, composed only of the rows that passed the filtering in the “map” phase.

For example, if we have 1,000 rows and only 200 match the filtering criteria for a particular report, instead of getting 1,000 rows in the intermediate file, out of which 800 share the same null value, we now get only 200 rows.
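Roughly, the pattern looks like the sketch below. The log layout, the filter string and the choice of key are invented for illustration; the combiner signature follows Disco’s classic worker API, where the buffer accumulates per-node sums and is flushed once done is True:

```python
def map(line, params):
    # All filtering lives here. Uninteresting rows still have to yield
    # a generic null entry - this is what bloated our intermediate files.
    if "GET /api/" in line:          # made-up filter for "interesting" rows
        yield line.split()[-1], 1    # e.g. key on the last field (status code)
    else:
        yield None, 0                # the generic null value

def combiner(key, value, buf, done, params):
    # Sums values per key on each node; the duplicate null entries
    # collapse into a single (None, 0) pair before being written out.
    if done:
        return buf.items()
    buf[key] = buf.get(key, 0) + value
```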

In some cases we saw run time improve by up to 50% (the speedup comes from having fewer rows from the intermediate files to reduce), not to mention a reduction in disk space used during execution thanks to the smaller intermediate files.

That way, we can keep the filtering logic in the “map” function while making sure we don’t end up reducing unnecessary data.