geocoding experiments

I wrote an initial blog post on Gisgraphy about a month ago. I wanted to write a follow-up, but hadn’t gotten around to it, what with all the other stuff I have going on. But I’m going to take a few minutes now and write something up.

The initial import process to get the Gisgraphy server up and running took about 250 hours. The documentation said it would take around 40 hours, but of course there’s no way to be accurate about that kind of thing without knowing about the specific hardware of the server it’s being installed on, and the environment it’s in. I’m guessing that, if I had more experience with AWS/EC2, I would have been better able to configure a machine properly for this project.

Once the import was complete, I started experimenting with the geocoding web service. I quickly discovered something that I’d overlooked when I was first testing against the author’s hosted web service. The geocoding web service takes a free-form address string as a parameter. It’s not set up to accept the parts of the address (street, city, state, zip) separately. It runs that string through an address parser, and here’s where we hit a problem. The address parser, while “part of the gisgraphy project,” is not actually open source. An installed instance of Gisgraphy, by default, calls out to the author’s web service to do the address parsing. And, if you call it too often, you get locked out, at which point you have to talk about licensing the DLL or JAR for it, or paying for access to it via web service.
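Since the service wants one string rather than separate fields, the caller has to join the parts before sending them. Here is a minimal sketch of that step in Python. The base URL, the `/geocoding/geocode` path, and the `address`, `country`, and `format` parameter names are assumptions on my part for a local install, and should be checked against the Gisgraphy documentation before use.

```python
from urllib.parse import urlencode

# Assumed URL for a locally installed instance (hypothetical, not verified).
BASE_URL = "http://localhost:8080/geocoding/geocode"


def build_geocode_url(street, city, state, zip_code, country="US"):
    """Join address parts into one free-form string and build a request URL."""
    # Skip missing or blank parts so the string has no stray commas.
    parts = [p.strip() for p in (street, city, state, zip_code) if p and p.strip()]
    address = ", ".join(parts)
    # Parameter names are assumptions, not confirmed against the real API.
    query = urlencode({"address": address, "country": country, "format": "json"})
    return f"{BASE_URL}?{query}"
```

This only builds the URL; it does not send the request, so it can be tried without a running server.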

Technically, the geocoder will work without the address parser, but I found that it returns largely useless results without it. For instance, it will happily return a result in California, given an address in New Jersey. I’m not entirely sure how the internal logic works, but it appears to just be doing a text search when it’s trying to geocode an un-parsed address, returning a location with a similar street name regardless of where in the US it is.
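One way to catch the wrong-state results described above is to compare each result against the state in the input address and drop the mismatches. This is only a sketch of that kind of check, not something Gisgraphy provides. It assumes the results have already been decoded into a list of dictionaries, and the `state` key is a hypothetical field name.

```python
def filter_by_state(results, expected_state):
    """Keep only results whose state matches the one in the input address."""
    wanted = expected_state.strip().upper()
    # "state" is a hypothetical field name; adjust it to the real response.
    return [
        r for r in results
        if str(r.get("state", "")).strip().upper() == wanted
    ]
```

A check like this would have filtered out the California match for a New Jersey address, though it does nothing to improve the matching itself.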

While I don’t think the author is purposely running a bait-and-switch, I also don’t think he’s clear enough about the fact that the address parser isn’t part of the open source project, and that the geocoder is fairly useless without it. So, we shut down the EC2 instance for this and moved on to other things.

Specifically, we moved on to MapQuest Open, which I was going to write up here in this post, but I need to head out to work now, so maybe another time.

2 Comments


  1. Hi Andrew,

    I’m the project owner of Gisgraphy. I have read your post very carefully. Believe me, my goal is not to hide anything. I will update the documentation to explain this better, and I will soon release an unlocked version of the address parser for users who install Gisgraphy locally. The address parser is not open source, and because I sell licenses for it separately, I cannot make it open source.

    Believe me, my goal is not a ‘bait-and-switch’. Only the geocoding part uses the parser; all the other web services are totally independent. If my goal were to make money, I think I would have chosen a way that takes less time (I spend hours developing, updating docs, answering the forum, etc.), and the free servers cost me money too. Providing a free server is not an obligation, even for an open source project.

    Best regards from France
    David


  2. David –
    Thanks for the response. I hope you didn’t take offense at anything I said. Your project is really cool, and I wish it had worked out for us. It’s obvious that you’ve put a lot of time and effort into it!
    – Andy

