Showing posts with label blogging. Show all posts

Wednesday, April 6, 2016

Migrating to Blogger

After a protracted quiet period, I'm rebooting my blog. Applause all around (from the 2 people who stuck around all these years waiting for a new post). As for the reasons I failed at blogging the first time around and what I can do to succeed this time, that will be the subject of a future post.

As a first step in the reboot, I decided to migrate my blog off a self-hosted Pebble/Tomcat setup and onto a free hosted blogging platform where I won't have to spend time and effort on maintenance and software/security updates. This will hopefully free me up to just focus on blogging. After an admittedly shallow search, I decided to give Blogger a try this time around. After going through the migration, I thought I would share what I like and dislike about Blogger so far. This list primarily focusses on my migration experience, configuration and first impressions. Given this is my first post on the Blogger platform, I haven't actually used the blogging features much yet. Here goes.

Things I like about Blogger:

  1. It's free to use with a custom domain registered through Google Domains. Using my own domain name was non-negotiable for me. The fact that I could easily do that for free with Blogger + Google Domains (where all my domains are already registered) made it a win over other hosted blogging platforms.
  2. Blogger integration with Google Domains is nice. DNS configuration for my blog was simply a checkbox in the Google Domains admin which created something Google Domains calls a synthetic record. The rest of the configuration was within Blogger itself.
  3. I can configure redirects. Some of the legacy Pebble URLs needed to be changed to be compatible with Blogger. Broken links would have been a deal-breaker for me. Blogger allows you to specify whether the redirects are temporary or permanent. Also, query string parameters are maintained through Blogger redirects, which is a plus.
  4. I can make "pages", i.e. static web pages that aren't blog entries per se (e.g. About me). Pages are not dated and are not included in the RSS feed, etc.
  5. The pre-fab templates look pretty good, are mobile-friendly out of the box and there were multiple template choices that suited my minimalist preferences.
  6. Adding widgets to the sidebar was straightforward. I was able to easily add my StackOverflow badge and Twitter feed.
  7. Adding Google Analytics tracking was as simple as pasting my GA ID into a text box in the settings area of the Blogger admin.
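
As a rough illustration of the redirect behavior described in item 3, here is a minimal Python sketch. It is not Blogger's implementation. The `REDIRECTS` table, the example paths and the `resolve_redirect` function are all hypothetical; the only point is to show a lookup that picks a temporary or permanent status and carries the query string over unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical redirect table: legacy path -> (new path, permanent?).
# The paths are made up for illustration only.
REDIRECTS = {
    "/2010/01/11/hello_world.html": ("/2010/01/hello-world.html", True),
    "/pages/about.html": ("/p/about.html", False),
}


def resolve_redirect(url):
    """Return (status, location) for a legacy URL, or None if no rule matches.

    The query string of the incoming URL is appended to the new location.
    """
    parts = urlsplit(url)
    rule = REDIRECTS.get(parts.path)
    if rule is None:
        return None
    new_path, permanent = rule
    status = 301 if permanent else 302  # permanent vs. temporary redirect
    location = urlunsplit(("", "", new_path, parts.query, ""))
    return status, location
```

For example, `resolve_redirect("/pages/about.html?utm_source=feed")` yields a temporary redirect to the new path with `utm_source=feed` still attached.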

Things I dislike about Blogger:

  1. You can't use a naked domain. When I tried, I got this error message:

    I had to change the canonical domain for my blog from the naked domain to www, which wasn't a deal-breaker for me, although I would have preferred to stick with the cleaner-looking naked domain for the canonical URL of my blog. At least Blogger provides a checkbox to redirect the naked domain to your chosen subdomain, www in my case.

    The most likely reason for not supporting naked domains is that a large-scale platform such as Blogger probably has all custom domains resolve through CNAME records pointing to load balancers. These CNAMEs resolve to the optimal load balancer for the requester's location. The DNS protocol doesn't allow a CNAME record at the zone apex, because a CNAME can't coexist with other records at the same name and the apex always carries SOA and NS records. Therefore, naked domains can only use an A record, which can only point to an IP address (and optionally an AAAA record pointing to an IPv6 address).

    Some DNS providers support CNAME-like functionality for naked domains. This is achieved by the DNS service resolving the CNAME on the fly and responding with an IP address as though it were an A record, thereby making the whole thing transparent to the client. Amazon calls these Aliases and Cloudflare calls it CNAME flattening. Still others call it an ANAME. But Google Domains doesn't seem to have this feature under any name at the moment.

  2. You can't use HTTPS with a custom domain.

    Again, this was not a deal-breaker for me as my legacy blog wasn't being served over HTTPS anyway. But given the recent trend towards end-to-end encryption on the web, and efforts such as Let's Encrypt to provide free SSL certificates, it would be nice for Blogger to offer a solution, free or otherwise. If I do want to get this blog on HTTPS, I might have to migrate off Blogger. (Proxying through Cloudflare might be an option.)

  3. Importing my content was a pain. Fortunately, there wasn't much of it. I had only ever published one blog post which received a grand total of 4 comments. Blogger does have blog import/export support for posts and comments, but the only supported format is Blogger's own XML format. Pebble didn't seem to have an exporter for this Blogger XML format so I ended up cut/pasting my blog post, the 4 comments, and the pages I had created into Blogger's web interface. I then used the Blogger export tool to export the XML so I could spoof the authors and dates of the comments directly in the raw XML and re-import it back into Blogger. It took me several rounds to get all the timestamps/timezones right. I ended up hitting a rate limit on the import function, which forced me to wait 24 hours before attempting another import. The URL paths and slugs that Blogger allows for posts didn't line up 1:1 with my legacy blog stack's paths so I had to set up redirects in order to not break any URLs. I also had to backdate the date portion of the path to match the actual date I published the original on the legacy blog stack. This manual tinkering obviously wouldn't have scaled beyond a small number of posts and comments.
  4. The auto-generated sitemap.xml includes blog posts but not "pages". I wanted to add them manually but I can't see any way to edit or override the auto-generated sitemap.xml. I guess the pages I created won't be in my sitemap :( Also, sitemap.xml is being served with a Content-Type of application/atom+xml, which I'm not sure is valid, although Google's own Webmaster Tools considered the sitemap to be valid.
    Update 2016-04-07: Blogger's undocumented pages feed can be submitted to Google Webmaster Tools as a sitemap in addition to the auto-generated sitemap.xml. Thanks for the tip @prayagverma!
  5. The custom redirects are limited to internal URLs. Some of the redirects I wanted to set up were to outside sites such as GitHub repos I have created, but I wasn't able to make them work. There may be a sound reason this is restricted, but it was still a let-down given my non-sinister use case.
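
To make the CNAME flattening idea from item 1 more concrete, here is a toy Python sketch. It is not how Amazon, Cloudflare or any real DNS service implements the feature, and it does no actual DNS lookups. The record tables, the host names (`example.com.`, `lb.example.net.`) and both functions are hypothetical, and the addresses come from the 192.0.2.0/24 documentation range.

```python
# Toy record store: name -> list of (type, value). Everything here is made up.
RECORDS = {
    "lb.example.net.": [("A", "192.0.2.10"), ("A", "192.0.2.11")],
    "www.example.com.": [("CNAME", "lb.example.net.")],
}

# Hypothetical provider-side alias for the zone apex. It lives outside the
# normal record set because a real CNAME is not allowed at the apex.
APEX_ALIASES = {"example.com.": "lb.example.net."}


def resolve_a(name, max_hops=8):
    """Follow CNAMEs starting at `name` and return the A addresses found."""
    for _ in range(max_hops):
        rrset = RECORDS.get(name, [])
        addresses = [value for rtype, value in rrset if rtype == "A"]
        if addresses:
            return addresses
        targets = [value for rtype, value in rrset if rtype == "CNAME"]
        if not targets:
            return []
        name = targets[0]  # chase the CNAME target on the next iteration
    return []


def answer_query(name):
    """Answer an A query the way a flattening DNS service might.

    For an aliased apex, the service resolves the target itself and replies
    with plain A records, so the client never sees a CNAME.
    """
    target = APEX_ALIASES.get(name)
    if target is not None:
        return [("A", address) for address in resolve_a(target)]
    return RECORDS.get(name, [])
```

In this sketch a query for `www.example.com.` gets the CNAME back and the client follows it, while a query for the apex `example.com.` gets A records directly, which is the part that makes the alias invisible to the client.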

Though it's imperfect, overall Blogger feels like an upgrade from my self-hosted Pebble stack. So I'm pleased. And now it's time to stop messing with the blogging software stack and start actually blogging!

Monday, January 11, 2010

Hello World! - The latest in a long line of things with this name

Of all the "Hello Worlds" I've cranked out as a programmer, this is perhaps the only one where the title has really been apt. Yes, I finally drank the delicious blogger kool-aid and decided to take a leap into blogdom. I guess this means I'll start using words like blogosphere, blogophilic and blogophobic. If I'm really successful, I'll even coin a new term in this blog. Here, let me try... ...It has to be a term that gets zero results when I google it. Ok, got one:

ablogual
An individual who neither blogs nor reads blogs. Not a member of the blogosphere due to either ignorance, lack of interest, a superiority/inferiority complex, or blogophobia. Sample usage: "That dude is totally ablogual. He's never even heard of Joel Spolsky."

Blogging, the sane approach vs. the Asaph approach

The sane approach to starting a new blog is to use one of the many fine blogging software packages freely available on the web so one can focus on writing insightful, witty drivel without getting bogged down in re-inventing the wheel. Being a programmer, I was of course tempted to embark on the largely pointless task of building yet another blogging engine. I actually started down this yak shave thinking it would be only minimally distracting.

Writing a blog engine is the new Hello World.

-- Jeff Atwood

Blogging software, just a bunch of static HTML pages, right?

Like many programming tasks, writing blogging software is deceptively simple at first glance. Being new to blogging, I thought: "How hard could it be?" The truly lazy would just roll a static HTML page for each blog entry. Right? Being slightly less lazy, I made my blogging software in Java and database driven (Are you impressed?). I got the basics for that up and running relatively quickly. After a few evenings of programming I even had a functional back-end admin console. Then it occurred to me that user comments would be essential. The people must be heard! But I would have to prevent comment spam somehow. Being selectively pragmatic, I decided I'd be willing to deal with that manually in the near term. One yak shave led to another and before I knew it, I was spending time writing a Gravatar URL generator and an HTML source code syntax highlighter.
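
A Gravatar URL generator of the kind mentioned above is a small piece of code. The following is a minimal Python sketch, not the Java code from the shelved blog engine. The function name `gravatar_url` and the default size are illustrative assumptions. It relies on Gravatar's long-standing scheme of using the MD5 hash of the trimmed, lowercased email address, with the `s` query parameter selecting the image size in pixels.

```python
import hashlib


def gravatar_url(email, size=80):
    """Build a Gravatar image URL for an email address.

    The address is trimmed and lowercased before hashing, so differently
    formatted spellings of the same address map to the same avatar.
    """
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}?s={size}"
```

Calling `gravatar_url(" Someone@Example.com ")` and `gravatar_url("someone@example.com")` returns the same URL.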

But those were just the features I could think of. The real kickers were the features I didn't know I needed because I was a blogging noob. Turns out, there is a secret XML undercurrent that moves information around the blogosphere. I don't want to miss that ride. Every blog needs an RSS feed (maybe Atom too), pings (but not spings), pingbacks, trackbacks, linkbacks, refbacks and smackbacks. Ok, some of those are redundant and at least one is made up. But the point is I underestimated the task. That's actually not surprising at all since, as a programmer, I routinely do that. If I ever give you an estimate, go ahead and double it.

After about a month, I came to my senses and decided that I should at least bootstrap my blog with an existing free solution while I roll my own. Honestly, the project is shelved indefinitely. I haven't admitted defeat but I've become keenly aware that it's probably a waste of my precious time.

Why re-invent the wheel when open source wheel libraries already exist?

The obvious choice to start is Wordpress. It's mature, solid, scalable, supported by a thriving community, has a bazillion plugins, and it appears to be the leading open source blogging solution. It's written in PHP, which while not my first choice, is a language I'm very comfortable with. I test drove it on my laptop. The install was a breeze. I simply unpacked the archive, created a MySQL database for it, set some easy config parameters and fired it up. Hitting the site in a browser for the first time prompted me to create the database schema and with that the install was pretty much complete. It worked well right away and I was ready to install it on my production server.

Only one thing stood in the way. My server runs Tomcat on port 80 and Wordpress runs on Apache. There is no shortage of instructions online for proxying requests from Apache to Tomcat but almost nothing for doing the reverse. The standard advice used to be to run Apache on port 80 for static content and proxy requests for dynamic content to Tomcat running on port 8080. But since Tomcat now serves content at speeds that compete with Apache, I don't think the setup makes sense for my servers. So I set up Apache on port 8080 and set off on the task of getting Tomcat to proxy requests to Wordpress. After writing a simple proxy servlet, I discovered that Wordpress breaks this scheme: it detects that the page was served from a non-canonical URL and issues a redirect. I started digging into the Wordpress internals only to find a total mess of thousand-line-long, non-object-oriented PHP scripts. I tried tweaking the siteurl and home entries in the options table of Wordpress's MySQL database but I couldn't get them quite right. Going with an open source solution was supposed to save time, so it was at this point that I decided to stop short of tweaking Wordpress's ugly PHP code.

The obvious next step was to look for the Java community's answer to Wordpress. This search led me to Apache Roller. The install was very similar to Wordpress: set up a MySQL database, deploy the code, edit a configuration file, and proceed the rest of the way with a browser-based install. I didn't like having to stick an Apache Roller properties file in Tomcat's lib folder. I think an application's configuration should be in the app's lib folder, not the webserver's lib folder. But after coming this far, I was willing to live with this annoyance. The other thing I noticed about Roller on my laptop is that it seemed sluggish. The web-based install actually took over a minute to complete. I tried to just put it out of my mind. After all, Apache Roller is supposedly robust and scalable enough to run Sun Microsystems' blogs. So I went ahead and installed Roller on my 2 production servers, which happen to be a couple of CentOS virtual machines with a paltry 256 megs of RAM each. One of the servers handles all my live web traffic and the other serves as a database server and failover web server. I immediately noticed that the load on the primary web server was higher than before. I hadn't even posted a blog entry yet. A couple of days later, I started seeing OutOfMemory exceptions in Tomcat's logs and getting error reports from users for one of my high traffic sites hosted on the same VM. I hadn't had this problem before installing Roller so I immediately pulled the plug and uninstalled Roller. The memory issues seemed to be immediately and permanently resolved. I imagine that Roller, which appears to include every Jakarta subproject under the sun in its lib folder, needs a lot more horsepower than my dinky CentOS VM or my MacBook Pro could provide. If anyone has any more insight, please leave a comment.

So with that, the search for a lightweight Java-based alternative to Apache Roller and Wordpress began. This led me to Pebble, an open source Java blog server supporting all the basic features I was looking for. After the Wordpress/Roller exercise, I was expecting an uphill battle. Unlike the other 2 blogging engines I tried, Pebble doesn't use a MySQL database for its backend. Instead it's based on Lucene, which theoretically should make the search function of my blog perform better than Wordpress or Roller, which would presumably have to rely on MySQL's rather weak full text search capabilities. I did end up tweaking my Pebble instance a little bit. While customizing robots.txt, I discovered it was being served with a Content-Type of text/html instead of text/plain. It looks like the bug is due to the fact that .txt files are set to be parsed as JSP files and are getting the default JSP Content-Type. I worked around this by adding <jsp:directive.page contentType="text/plain" /> to the top of robots.txt and it solved the problem. I should probably submit a patch for that fix. So I held my breath and installed Pebble on my production servers. Unlike with Roller, they seemed to be unaffected, which is of course a good thing. Finally, I was ready to start blogging! Maybe... Hopefully... Meh. I'm sure something will go wrong...

So I've started a blog. Now what?

So what can you expect from yet another programmer's blog?

  • code
  • tips & tricks
  • marginally witty commentary about programming
  • humorous stories about code gone bad
  • programming related rants
  • book reviews
  • web site reviews
  • warnings about quirks and strange behavior that I encounter in various APIs or software packages
  • shameless self promotion
  • fawning over my favorite technologies
  • bragging about projects I'm working on
  • miscellaneous

Traditionally, Hello World! is supposed to be just a few lines. It should certainly all fit above the fold. But economy of expression is overrated and verbosity is a programmer's friend. This has been a little long but not a bad start. Ok, now that Hello World is finished, it's time to get some real work done.