Upgrade PostgreSQL on Scientific Linux

We upgraded two database servers this weekend, one from PostgreSQL 9.1 and the other from 9.2, bringing both to 9.3. What follows are my combined process notes, in the hope that they will help you.

Preparation

To do this, you must have enough free disk space on your data drive to make a duplicate of the existing cluster (that is, all databases hosted on the server). So for example, our data drive on one server had 55% usage and I had to clear it to 50% (the drive is dedicated to database storage). On the other server it was 66% consumed. In both cases I removed files that were unrelated to the cluster (backups and WAL archives) and moved them off-server. In the case of the second server this wasn’t enough. If you can easily install or mount a new drive, that’s much easier than these steps but we didn’t have that luxury.
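A rough way to check this before you start is to compare the size of the existing data directory with the free space on its drive. The sketch below assumes the default 9.2 data path (ours was elsewhere) and uses a made-up helper name, has_room_for_copy. Run it as a user who can read the data directory.

```shell
# Hypothetical helper: succeeds when free space (KB) can hold a full copy of the cluster (KB).
has_room_for_copy() {
    cluster_kb=$1
    free_kb=$2
    [ "$free_kb" -ge "$cluster_kb" ]
}

# Assumed location of the old cluster; adjust to your own data directory.
OLD_DATA=${OLD_DATA:-/var/lib/pgsql/9.2/data}

if [ -r "$OLD_DATA" ]; then
    # du -sk: total size in KB; df -Pk: POSIX layout, column 4 is available KB.
    cluster_kb=$(du -sk "$OLD_DATA" | awk '{print $1}')
    free_kb=$(df -Pk "$OLD_DATA" | awk 'NR==2 {print $4}')
    if has_room_for_copy "$cluster_kb" "$free_kb"; then
        echo "enough room: cluster ${cluster_kb}KB, free ${free_kb}KB"
    else
        echo "not enough room: cluster ${cluster_kb}KB, free ${free_kb}KB"
    fi
fi
```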

You can free up disk space by re-indexing the databases, running VACUUM FULL, or dumping and restoring. Re-indexing can be done without taking the database offline, although REINDEX blocks writes to each table while its indexes are rebuilt. The other two effectively require taking the database out of service and may take hours or days for a multi-gigabyte cluster. Restoring from a pg_dump backup took 18 hours for a 250GB database (13GB gzipped backup file) and 39 hours for our 450GB cluster (25GB backup file). From everything I’ve read, for databases in the hundreds of gigabytes and larger, VACUUM FULL will basically take forever. It’s faster to dump and rebuild the database.

However, you can recover a significant amount of space by re-indexing. We recovered 100GB of our 600GB cluster by running REINDEX on each database. Note that this took 3 hours for one 260GB database and 4.5 hours for a different 250GB database. The major difference between the two was that the latter had older data, so its indexes were more fragmented.

    sudo su - postgres -c 'psql app-database-production'
    REINDEX DATABASE "app-database-production";
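If the cluster holds several databases, a loop saves retyping. This is a sketch rather than what we ran: the helper names are made up for illustration, it assumes database names contain no double quotes, and the bundled reindexdb utility with --all does much the same thing.

```shell
# Hypothetical helper: build the REINDEX statement for one database name.
reindex_sql() {
    printf 'REINDEX DATABASE "%s";\n' "$1"
}

# Hypothetical helper: reindex every connectable, non-template database in turn.
# Run as the postgres user. REINDEX DATABASE must be issued while connected
# to the database being reindexed, hence the -d "$db" on each call.
reindex_all() {
    psql -At -d postgres -c "SELECT datname FROM pg_database WHERE datallowconn AND NOT datistemplate" |
    while read -r db; do
        echo "reindexing $db"
        psql -d "$db" -c "$(reindex_sql "$db")"
    done
}

# Usage (as the postgres user):
#   reindex_all
```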

Instructions

We’re using Scientific Linux. The PostgreSQL Global Development Group has made a repository of builds available for binary distributions. You install the repository by installing the RPM for the repository (this was weirdly meta for me but it works). This creates a pgdg93 repository. See the repository packages page for more links.

    sudo rpm -ivh http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-redhat93-9.3-1.noarch.rpm

You can then install the new PostgreSQL 9.3 releases:

    sudo yum install postgresql93 postgresql93-devel postgresql93-libs postgresql93-server postgresql93-contrib

postgresql93-contrib is only needed for the pg_upgrade tool we’re going to use. You can remove it after the upgrade if you want.

Make a database folder

Create the new database data folder as postgres user. The Scientific Linux distro puts postgresql into /usr/pgsql-VERSION. The default data file location is /var/lib/pgsql/VERSION/data, although ours is mounted on a separate drive.

    sudo su postgres -c '/usr/pgsql-9.3/bin/initdb -D /var/lib/pgsql/9.3/data'

Disable database access

Stop all connections to the database and disable your web application. This next phase can take several hours so you’ll want to make sure you have time. Our 845GB cluster took a little over 2 hours of server downtime.

In our case, closing connections meant stopping resque workers that we have managed by monit, and disabling the web applications with capistrano maintenance mode. We also stop monitoring the database postmaster process to ensure that monit doesn’t restart it while we’re doing the upgrade. Obviously these are only meant to jog your thoughts; your own infrastructure will look different.

    worker-server$ sudo monit -g resque-workers stop

    database-server$ sudo monit unmonitor postmaster
    database-server$ sudo /etc/init.d/postgresql-9.1 stop

    dev$ cap production deploy:web:disable REASON="a scheduled system upgrade" UNTIL="at 11pm Pacific Time"

Run the upgrade

Run the new pg_upgrade to migrate from the old version (-b,-d) to the new version (-B,-D). This is the part that takes a couple hours per server.

    sudo su postgres -c '/usr/pgsql-9.3/bin/pg_upgrade -B /usr/pgsql-9.3/bin -b /usr/pgsql-9.2/bin -D /var/lib/pgsql/9.3/data -d /var/lib/pgsql/9.2/data'
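pg_upgrade also accepts a --check flag, which only verifies that the two clusters are compatible and leaves the data untouched, so it can be tried before the outage window. A sketch using the same 9.2 and 9.3 paths as the command above; the function name pg_upgrade_cmd is hypothetical.

```shell
# Hypothetical helper: print the pg_upgrade command for a given old and new version.
# Paths follow the /usr/pgsql-VERSION and /var/lib/pgsql/VERSION/data layout
# described earlier; adjust them if your data lives on a separate drive.
pg_upgrade_cmd() {
    old=$1
    new=$2
    shift 2
    echo "/usr/pgsql-$new/bin/pg_upgrade" \
        -b "/usr/pgsql-$old/bin" -B "/usr/pgsql-$new/bin" \
        -d "/var/lib/pgsql/$old/data" -D "/var/lib/pgsql/$new/data" "$@"
}

# Dry run: checks compatibility only, changes nothing.
#   sudo su postgres -c "$(pg_upgrade_cmd 9.2 9.3 --check)"
# Real run:
#   sudo su postgres -c "$(pg_upgrade_cmd 9.2 9.3)"
```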

Verify the new cluster

Manually inspect the differences between the startup scripts:

    diff /etc/init.d/postgresql-9.?

Transfer any important things to the 9.3 script and remove the 9.2 one. In our case we have a custom PGDATA setting.

Similarly compare the pg_hba.conf and postgresql.conf files in the old data directory with the new ones. The postgresql.conf can be tedious if you’ve done a lot of tuning. (p.s. Anyone know of a good diff tool for configuration files that can compare uncommented lines in either version with their commented pairs in the other?)

    diff /var/lib/pgsql/9.2/data/postgresql.conf /var/lib/pgsql/9.3/data/postgresql.conf
    diff /var/lib/pgsql/9.2/data/pg_hba.conf /var/lib/pgsql/9.3/data/pg_hba.conf
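One partial workaround for the noise is to strip comments and blank lines first and compare only the settings that are actually active. This sketch does not pair active lines with their commented counterparts, and it would mangle a value that contains a literal #. The names active_settings and diff_active are made up.

```shell
# Hypothetical helper: print only the active (uncommented) settings of a config file, sorted.
active_settings() {
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1" | sort
}

# Hypothetical helper: diff the active settings of two files.
# Uses temporary files so it works in plain sh as well as bash.
diff_active() {
    old_tmp=$(mktemp)
    new_tmp=$(mktemp)
    active_settings "$1" > "$old_tmp"
    active_settings "$2" > "$new_tmp"
    status=0
    diff "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}

# Usage:
#   diff_active /var/lib/pgsql/9.2/data/postgresql.conf /var/lib/pgsql/9.3/data/postgresql.conf
```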

Start the new postgresql and analyze the new cluster to optimize your database for the new version. (Note that analyze_cluster.sh is installed into the working directory when pg_upgrade is run.) The analyze script has three phases. The minimal one will get the database up and running in a couple minutes. So you can bring things back online at this point or wait until it’s fully complete.

    sudo /etc/init.d/postgresql-9.3 start
    sudo su postgres -c ./analyze_cluster.sh

Bring things back online

If you’re running monit (or god or something) to manage your postgresql server, you’ll need to modify the script with new references.

Now, bring everything back up online that you disabled earlier.

    database-server$ sudo monit reload

    worker-server$ sudo monit start all

    dev$ cap production deploy:web:enable

Test that things are working with everything up and running.

Clean up

If you’re satisfied with the new system, you can delete the old cluster. pg_upgrade leaves a delete_old_cluster.sh script in the working directory you ran it from.

    sudo su postgres -c ./delete_old_cluster.sh

If you’d rather be a little more careful (after all it only copied the database files over) you can delete the old data/base folder, which is the bulk of the storage, and keep other configuration files around in case you need to recover them.


References:

1. How to install PostgreSQL 9.2 on RHEL/CentOS/Scientific Linux 5 and 6
2. pg_upgrade
3. REINDEX
4. How to optimize PostgreSQL database size

StackOverflow has 25% unanswered questions

We were talking the other day at Ada Developers Academy about whether StackOverflow has an increasing barrier to entry. In order to play you have to answer a question. This was easier when it first started (and when it first started I had lots of experience). In the last few years many, many of the beginner questions are already answered. And for beginners, it’s even harder to find questions you can answer. The students I was talking to were frustrated by this.

At the same time, I was lamenting the fact that the few questions I’ve asked don’t get answered. Maybe because I only ask questions I can’t find the answer to and that are really hard. Or maybe they are uninteresting.

So I checked and discovered that this morning StackOverflow lists 1,687,405 unanswered questions of 7,016,543 total. That’s about 24%, nearly a quarter of all questions, unanswered! This really surprised me.
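The exact share is easy to check from the two counts. A quick sketch, with pct as a made-up helper name:

```shell
# Hypothetical helper: percentage of part in total, to one decimal place.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", part / total * 100 }'
}

pct 1687405 7016543   # prints 24.0
```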

So I’m sure there are questions for beginning programmers just by the sheer volume of questions. Finding them may be a challenge.

[Update 2014-05-12]

There’s an interesting post on meta.stackoverflow.com where the top responders seem to think that the quality of questions is getting lower and more repetitive. One responder to this meta question decried the questions on SO as being “answered 100 times, or is a ‘do my work for me’ question.” So from their perspective the reason 25% are unanswered is that they have been answered before or the requester hasn’t done sufficient pre-work to warrant asking a question.

Sorry coders, graphic design is 75% of the game

We spend a lot of time making our software well architected, maintainable, bug free and well written. It’s important to us that we build software that we can be proud of and that we can scale and maintain (alternately, you can take Fred George’s extreme position of building everything on throw-away microservices, but that’s a different post). In fact, in a survey of my team we found that companies we have worked in often have five to seven developers per graphic designer. That implies that it takes a lot more development effort than graphic design effort to get the job done.

My experience is that, from the customer’s perspective, great graphic design can cover a multitude of sins. A site that looks great will have customers apologizing for bugs. A site that looks ugly or unbalanced will have people looking for bugs. Whether we like it or not, customer opinion is predicated more on graphic design than on all the work we spent making the software fast and robust.

Here’s a story I tell to illustrate this (I’m sure you have your own similar experience).

When I bought my house I knew I had to replace the furnace — it was a gas furnace from the 20’s that took up more square feet than a bathroom. I had a company install a top-of-the-line energy efficient gas furnace, inside the crawlspace, which meant re-building all the duct work in the basement. When they had completed the job, I saw that they had used three different types of duct tape throughout the basement. This immediately got me looking at other issues — such as the dampers they had not installed, seams that were either not taped or not sealed, and other minor issues. I had them come back and complete the job; but if they had been consistent in their duct tape use, I probably would have let it go at just having the dampers installed and maybe even let that be. The fact that things were inconsistent made me concerned about the quality of the work and look for more problems.

The corollary to this story is when I had my house re-plumbed with all copper pipes. The plumbers did a beautiful job, sweating clean, perfect joints and getting into the tight areas in the house with a minimal number of holes. I was very impressed with the work. (As was the inspector: “Ken did this job? Ok. You’re all set, then. Bye.”) Even though it took two days to really flush the flux out of the pipes, I didn’t mind because it looked so pretty.

That doesn’t mean we’re not putting a lot of effort into robust, responsive, maintainable code. But it does mean that we try to have damn good graphic design.

Why your back-end tools should be sexy, too

If your internal tools aren’t the same quality and sexiness as your client-facing tools, then your employees aren’t going to be as excited as they could be and won’t be selling your company as well as they could.

While working at Bookr, our design motto was “dead simple; dead sexy.” It’s something that I think is a great, simple goal that most everyone in the company can at least target. We applied this to our product and customers loved the way it looked. I’ve carried that through (internally) for Blueprint, although I don’t think that “sexy” quite fits our brand here as it did for Bookr.

On almost every project I’ve worked on, we put a lot of effort into the client-facing parts of the product, while the tools that are used internally (often called “admin” screens) just don’t get the same treatment.

Have you ever called up a company on the phone and had to wait while the representative navigates an extremely complex system or set of systems to get your information? Ok, have you ever called a company and not had that experience? Doesn’t that frustrate you as a customer? Quite often, I’ve had representatives apologize for the poor quality or capabilities of their software. As a software developer, I don’t want this ever to happen with my products.

So in the last two companies, I’ve been working to promote as much, or nearly as much, effort on the design and thoughtfulness and usefulness of “admin” screens as I do on the client facing stuff.

For Blueprint, not only are our clients using the product directly, but account managers and analysts use the product for consultative services. They get the benefit of client-facing tools, but when they cross into the “admin” parts of the site I don’t want them to feel put off. We are constantly looking at the tools we have built for managing, tracking and maintaining client accounts and client information (tools that clients don’t have access to) and at the way we use those tools, to try to make sure they can be simple and sexy.

Why you maybe shouldn’t praise employees’ talent

In a series of studies Carol Dweck (with C. M. Mueller, 1998)[1] found that praising students’ abilities conveys “that their ability is a gift and makes them reluctant to take on challenging tasks that hold a risk of mistakes.”[2]

I recently read Why Aren’t More Women in Science?, which was surprisingly interesting, especially as learned scientists and researchers come to different conclusions in answering the question posed in the title. There is a lot of interesting information in there about behavioral science and I recommend reading it if you are interested in technology, women in technology, and the recent push for STEM instruction in schools.

However, I found the results of Dweck’s 1998 study to be surprising and contradictory to almost all advice in both teaching and management. What Dweck discovered, though, actually makes sense. If you praise an individual for an innate ability (talent, intelligence, etc.), then she may be reluctant to risk losing your esteem, which you have suggested rests on something that she cannot change.

This doesn’t mean you shouldn’t praise people, though. I think that what Dweck suggests is that you should praise people for what they’ve accomplished, for a job well done, rather than suggesting that they are naturally good at it. As I’ve tried to look at my practice of praise (which I continually work on) I find that this is hard, because I want to recognize individual strengths (First Break All the Rules was my first management bible).


[1] Mueller, C. M., & Dweck, C. S. (1998). Intelligence praise can undermine motivation and performance. Journal of Personality and Social Psychology, 75, 33-52.
[2] Dweck, C. S. (2007). Beliefs That Put Females at Risk. Why Aren’t More Women in Science? Washington, D.C.: American Psychological Association.

Emotional search with Twitter

I’ve been reading the Twitter API docs and just discovered that Twitter lets you search based on the tone of the tweet using emoticons. For example, type this search:

#829strike :)

Twitter returns tweets with a positive attitude about the fast food strike on 8/29.

If you want to see the other side of the emotions, use a frowny face.

#829strike :(

This works on Twitter Search as well as through the API.

Using emoticons as operators to represent attitude is pretty cool.

Yes, it turns out Windows 8 is relevant

[Image: Meeker 2012, p. 24]

Here in Seattle among small tech companies (or start-ups), and among pundits worldwide, people like to say that Windows is irrelevant. They point to that slide (#24) in Mary Meeker’s State of the Internet last year (or slide #109 in this year’s deck, although it looks like that’s missing the Apple segment) that shows that Microsoft dominated computing platforms for 20 years and is being replaced by iOS and Android. They say that Windows 8 isn’t important and that Windows Phone 8 will never gain the market share that Apple and Android have.

I’m sure that there is lots of research to back up these claims (Gartner and IDC chimed in after all), but flat design is pervasive. The big news this week out of WWDC is that Apple’s iOS 7 has moved to flat design. Android 4 dropped skeuomorphic elements 18 months ago, even changing the text box to a single line. The latest Gmail for Android looks like Windows Phone 7, with quartered avatars that are incredibly reminiscent of the Metro tiles.

[Image: Zune 2007]

All that started at Microsoft. The elements that Microsoft built on for Metro were hatched in the Zune “chromeless” interface Method designed in 2007, carried into Windows Phone 7 and then Windows 8. The Metro interface has gotten tons of not-very-friendly press, but it’s clearly made an impact on the world.

So when the entire world of design is changed based on ideas coming out of Windows and Microsoft, I find it disingenuous to claim that they are not relevant. I couldn’t say whether Windows 8 is selling well enough for Microsoft, but it’s certainly making its mark.