Windows 8

It’s Sunday morning, and I’ve got nothing much to do, other than wait for Hurricane Sandy to hit, so I thought I’d catch up on blogging. I have a few things I want to write up, the first being some thoughts on Windows 8. (I’ve found a couple of good reviews/articles on Win 8 at The Register and ComputerWorld.)

I pre-ordered a boxed copy of the Windows 8 upgrade from Newegg, and I’d planned on using that to upgrade my ThinkPad from Windows 7 to 8 this weekend. However, it hasn’t arrived yet. I then thought about just downloading the $40 upgrade from Microsoft and using the boxed copy to upgrade my desktop at some point. I went as far as running the upgrade advisor on the ThinkPad, but the results I got made me back off on that plan and rethink things a bit.

Specifically, Visual Studio 2010 is listed as “not compatible”. I was pretty surprised at this, since I would expect that MS would want developers to be able to move to Win 8 early. I realize that they’d also like to see developers move to VS 2012, but they must know that not everyone can do that right away.

So, I’ve been thinking about my options. One option would be to just do a clean install of Windows 8 on the ThinkPad, and not worry about VS 2010. I do like having it available, but the ThinkPad isn’t my main machine, so there’s no reason I really need it to have VS 2010. Another option would be to just try the upgrade and see what happens. This guy has apparently had some luck with VS 2010 on Windows 8, so maybe it’ll work, even if it’s marked as “not compatible” by the upgrade advisor.

Another interesting thought I’ve had, after reading about how awesome Hyper-V is on Windows 8, is to have a fairly vanilla Win 8 install on the ThinkPad, then have VS 2010 and some other stuff set up in a Win 7 VM. (There are good articles on Hyper-V support in Windows 8 here and here.) Of course, then I need to have a Win 7 license that I can use in a VM. In the past, I’ve learned the hard way that you can’t reuse an OS license from a physical machine from a major OEM in a VM — it detects that you’re not on actual hardware from that OEM, and locks you out. I’m not 100% sure if that’s still the case, but I’d bet it is. So I can’t just use the ThinkPad Win 7 license in the VM.

I think I have a Win 7 product key from my old MSDN subscription, from my previous employer, but that subscription expired a couple of years ago, and I’m not sure if the product keys would still be valid. Which then brings up a bigger question that I’ve been putting off thinking about: Is it time for me to break down and finally buy my own MSDN subscription, or TechNet subscription? TechNet is affordable enough, but MSDN costs about as much as a new laptop would. I like being able to mess with VMs and experiment with new stuff from Microsoft, but the cost of doing so if somewhat prohibitive.

geocoding experiments

I wrote an initial blog post on Gisgraphy about a month ago. I wanted to write a follow-up, but hadn’t gotten around to it, what with all the other stuff I have going on. But I’m going to take a few minutes now and write something up.

The initial import process to get the Gisgraphy server up and running took about 250 hours. The documentation said it would take around 40 hours, but of course there’s no way to be accurate about that kind of thing without knowing about the specific hardware of the server it’s being installed on, and the environment it’s in. I’m guessing that, if I had more experience with AWS/EC2, I would have been better able to configure a machine properly for this project.

Once the import was complete, I started experimenting with the geocoding web service. I quickly discovered something that I’d overlooked when I was first testing, against his hosted web service. The geocoding web service takes a free-form address string as a parameter. It’s not set up to accept the parts of the address (street, city, state, zip) separately. It runs that string through an address parser, and here’s where we hit a problem. The address parser, while “part of the gisgraphy project,” is not actually open source. An installed instance of Gisgraphy, by default, calls out to the author’s web service site to do the address parsing. And, if you call it too often, you get locked out. At which point, you have to talk about licensing the DLL or JAR for it, or paying for access to it via web service.

Technically, the geocoder will work without the address parser, but I found that it returns largely useless results without it. For instance, it will happily return a result in California, given an address in New Jersey. I’m not entirely sure how the internal logic works, but it appears to just be doing a text search when it’s trying to geocode an un-parsed address, merely returning a location with a similar street name, for instance, regardless of where in the US it is.

While I don’t think the author is purposely running a bait-and-switch, I also don’t think he’s clear enough about the fact that the address parser isn’t part of the open source project, and that the geocoder is fairly useless without it. So, we shut down the EC2 instance for this and moved on the other things.

Specifically, we moved on to MapQuest Open, which I was going to write up here in this post, but I need to head out to work now, so maybe another time.

almost done

I just got back from my parents’ old house in Whiting. I threw out three bags full of stuff, and carted away four plastic tubs full of random stuff, three of which came home with me and one of which has been dropped off at my storage unit in Bridgewater. The house is now as empty as I hope it needs to be. The closing, as far as I know, is still scheduled for Wednesday. I think I still need to make one more trip to Whiting, to drop off my keys and some paperwork at the Cedar Glen Lakes office, but I don’t think I’ll need to go back to the house again. It feels weird, after having it on the market for two years, to finally be (almost) done with it.

Two Man Gentleman Band

I heard these guys on Soundcheck yesterday. They were pretty funny. Worth a listen, if you like funny songs about food and booze and stuff!

Gisgraphy

My boss stumbled across a project named Gisgraphy recently. A big part of what we do involves the need for geocoding. We have generally been using geocode.com for batch geocoding, but there’s a cost to that, and they only do US and Canada. There are many other geocoding services, but if you’re doing heavy volume, you’re generally excluded from free options, and the paid options can get expensive.

Gisgraphy is an open source project that you can set up on your own server. It will pull in data from freely-available sources, load it all into a local database, then allow you to use a REST web service to geocode addresses. A little testing with some US addresses leads me to believe that it’s generally accurate to street level, but not quite to house level. So, I’m not sure that we’ll want to use it for all of our geocoding, but it ought to be generally useful.

We decided to set it up on an AWS EC2 instance. We started messing with EC2 VMs for another project, and it seemed like EC2 would be a good fit for this project too. I started out with a small instance Linux VM, but switched it to a medium instance, since the importer was really stressing the small instance. I will probably switch back to small after the import is done. That’s one nice thing about EC2: being able to mess with the horsepower available to your VM.

Gisgraphy uses several technologies that are outside my comfort zone. I’m primarily a Windows / .NET / SQL Server guy, with a reasonable amount of experience with Linux / MySQL / PHP. Gisgraphy runs on Linux (also on Windows, but it’s obviously more at home on Linux), so that’s ok. But it’s written in Java, and uses PostgreSQL as its back-end database. I have only very limited experience with Java and PostgreSQL. And, of course, I’m new to AWS/EC2 also.

So, setting this all up was a bit of a challenge. The instructions are ok, but somewhat out of date. I’m using Ubuntu 12.04 LTS on EC2, and many things aren’t found in the same places as they were under whatever Linux environment he based his instructions on. For the sake of anyone else who might need a little help getting the basic setup done under a recent version of Ubuntu, I thought I’d list out a few pointers, where I had to do things a bit differently than found in the Linux instructions:

  • Java: I installed Java like this: “sudo apt-get install openjdk-6-jdk openjdk-6-jre”.
  • And JAVA_HOME should be /usr/lib/jvm/java-6-openjdk-i386/ or /usr/lib/jvm/java-6-openjdk-amd64/.
  •  PostgreSQL: I installed the most recent versions of PostgreSQL and PostGIS like this: “sudo apt-get install postgresql postgresql-contrib postgis postgresql-9.1-postgis”.
  • Config files were in /etc/postgresql/9.1/main and data files were in /var/lib/postgresql/9.1/main.
  • PostGIS: In his instructions for configuring PostGIS, the “createlang” command wasn’t necessary. 
  • And the SQL scripts you need to run are /usr/share/postgresql/9.1/contrib/postgis-1.5/postgis.sql and spatial_ref_sys.sql.

That’s about it for now, I think. I want to write up another blog entry on Gisgraphy, once I’ve got it fully up & running. And there might be some value in a blog entry on EC2. But now I have to get back to finishing my laundry!

Stumbling my way through the Drupal API

I’ve had to fix some interesting problems at work recently, related to a Drupal site that we’ll be rolling out soon. I just finished fixing one issue that, while seemingly minor, took quite a while to figure out.

I’m really glad to have come up with a good solution. The thing that amuses me most about this is that, after more than eight hours of messing around, the final solution involved writing only about a half-dozen lines of code.

The problem, in a nutshell, is that we have a content type in the system that represents a university. There’s a location field on each node with, minimally, city and country specified. The user can search for locations, using some custom search code, and the search results are displayed via a standard Drupal view. We allow the user to sort the results by one of a few different fields, with the sort drop-down exposed from the view. This works fine, except when sorting by country. On the location record, only the two-letter country code is stored, for instance “AE” for “United Arab Emirates”. So, when you sort by country, it’s really sorting on country code, so “AE” goes to the top, which isn’t really what the client wanted.

Of course, the first thing I did was Google the problem. I found this issue discussion, which pretty much matches my problem. There was a suggestion in the comments there about using hook_views_pre_render to re-sort the results right before displaying them. That works great, if you’re not paging results. But, if you’re pulling results back one page at a time from a large result set, this doesn’t work, since the pre-render hook only gives you the current page.

So I figured out that I really need to sort by country name at the SQL level, while retrieving results. This led to my next problem, which is that, even with the location module installed, there’s no SQL country lookup table in Drupal. The list of country codes and names is just stored in code, in an array, which can be retrieved via _country_get_predefined_list. (You shouldn’t call that directly, though, of course; you should use country_get_list.)

So off I went to find a module that could give me a SQL table with country info in it. The countries module does that, and a bit more. So, I installed that and figured out where the country table was. Then, my next blind alley was figuring out how to join to the new country table in a view. I was hoping I could just add a join to it in the view definition, and go from there. Well, I still don’t know that much about Drupal views, and it didn’t seem possible to do that easily.

So, the next blind alley was to see if I could alter the view SQL with hook_views_query_alter, which seemed sensible. Well, the query object that you get from that hook isn’t a nice simple query object that can easily be changed, so that turned out to be another dead end. (It’s likely possible that I could have figured it out, but it seemed like the wrong approach.)

Then, finally, I stumbled across this SO question. The one answer posted there led me in the direction of modifying the query with hook_query_alter, which can be used to modify just about any query Drupal issues to MySQL. So, finally, I found a workable solution.

hasAllTags('views', 'views_university_search')) {
    $ord =& $query->getOrderBy();
    if (array_key_exists('location_country', $ord)) {
      $query->addJoin('INNER', 'countries_country', 'cc', 'cc.iso2 = location.country');
      $ord = array('cc.name' => $ord['location_country']);
    }
  }
}

So that’s it. I add a join, and replace the ‘order by’ clause. About a half-dozen lines of code. Oh, and I now also understand passing by reference in PHP a little better too!

Museum visits

I got on a bit of a museum kick last August, and I seem to be doing the same thing this August. I went to the Met last weekend. I was going to go to the Frick and Whitney today, but my nearly deaf cab driver misunderstood my destination, and dropped me off much closer to the Met than the Frick, so I decided to just go with it, and visited the Met again. This is fine, as there is so much stuff in the Met that you can go twice in two weeks and see completely different stuff.  This time around, I stumbled into the Degas section, and spent some time browsing around in that neighborhood.

Last weekend, I took a cab to and from the museum. This weekend, I took a cab up, but walked back to Penn Station, which is a nice long walk. My ankles and knees hurt a little now, but I made the walk without any grief, so that makes me feel a little better about my current fitness level.

fun with WSDL and CURL

Ever since the debacle described in this blog post, I’ve made it a point to double-check the WSDL on the SOAP web services for our main product, any time I’m doing a non-trivial rollout, even if I know I haven’t changed anything that should affect the WSDL.

Up until today, I’ve always just done it by bringing up the WSDL URL for each web service in Firefox, and saving it to a text file. There’s only a half-dozen web services, so it doesn’t take that long. But this morning I finally broke down and wrote a batch file to fetch them all, using cURL.

I’ve gotten a bit more enthusiastic about using cURL, and other tools, to simplify things for me recently, since reading this blog post by Scott Hanselman.

random PHP functions

I’m still doing a fair amount of PHP work. Right now, it’s all Drupal, for a site we’re rolling out very soon. I keep stumbling across random PHP functions I hadn’t heard of before, and that turn out to be nice little time savers. Two examples:

  1. curl_setopt_array: I used to just call curl_setopt() a bunch of times to set all my options Now I can set a bunch of options in one fell swoop.
  2. http_build_query: Nothing I couldn’t previously do with simple string concatenation, but this is much cleaner.

WIndows 8, Mountain Lion, and Ubuntu 12

I have to do a 10pm web site rollout tonight, so I find myself at home with some time to kill. I haven’t gotten much of a chance to play around with Windows 8, so I decided to download the 90-day eval, and install it on my old laptop. I have the ISO downloaded and ready to go now. However, I had installed Ubuntu 11 on the laptop back in February. I haven’t really played around with it much since then, and I was ready to wipe it out, but when I turned it on, I got an update message letting me know that I could update it to Ubuntu 12.04 LTS. Well, I decided I’d rather upgrade the Ubuntu install on this laptop rather than wiping it out and starting over with Windows 8. It’s running now, and seems to be chugging along smoothly.

I did a little searching, and it looks like 12.04.1 was only just released. There’s an article about it on ZDNet, dated yesterday. And I guess the original 12.04 release was a few months back, based on the date on this Lifehacker article.

There’s been a lot of OS-related news lately, with Mountain Lion just released and Windows 8 nearing general availability. My old 2007 MacBook can’t handle Mountain Lion, so I’m sticking with plain-old Lion on that for now. I’m tentatively planning to buy myself a new MacBook Pro early next year, but I’m not really that worried about it right now. And I’m curious about Windows 8, but not that enthusiastic about it, given what I already know. I read an interesting CNET article this morning, comparing Mountain Lion and Windows 8. I think I agree with his conclusions, for the most part.

I will likely upgrade both my Windows desktop and laptop to Windows 8, when the consumer version is released, but I’m not that excited about it. Meanwhile, maybe I’ll play around with Ubuntu a bit more!