Getting Crafty With OSM Buildings

OpenEverythingMap

OpenStreetMap (OSM) is the largest open-source map in the world. As you may surmise from its name, it notably maps streets. But the database holds more than just streets. Oh my, yes.

Every line in OSM is still officially a “way”, a nod to OSM’s British origins and to the more common British usage of “way” to mean “road,” from the Old English “weg”, which meant literally “road” (in the old English usage of the word “literally” to literally mean “literally”). In the original definition on the OSM wiki, a “way” is “a street, footpath, railway line, etc.” But not all ways are ways anymore.

Over the 10-year history of OSM, a complex semantic ecosystem has emerged, repurposing the lines first used to draw streets to delineate everything else OSM volunteers wanted to draw. At this point, most of the “streets” of OpenStreetMap are no longer streets — they’re buildings.

Of all the mapped objects in OSM today, the most common is a building. Eyeballing the counts on taginfo.openstreetmap.org, building data outweighs roadway data by a 3:2 ratio.

There are now nearly 130 million buildings in OSM. Many of these have been entered by hand, traced from satellite imagery, but recent large-scale imports of open-source municipal and national databases have added many more, including a million from the NYC Building Footprints dataset, 18 million+ from the Dutch national Registry of Addresses and Buildings, and at least 35 million from the French equivalent of the IRS.

In an attempt to learn more about these buildings, the ways that they are stored in OSM, and the ways they might be useful (and also to contribute to our maps puzzle table at the Code for America Summit), I decided to try to export a building and make a papercraft out of it. This turned into two very different problems, united by one common fact: OSM doesn’t store buildings. Not really.

Flat Earth Society

So first, some mapsplaining. For a variety of reasons having to do with the nuclear strong force, the electromagnetic force, and gravity, maps are usually flat. Thus: OSM’s data is generally assumed to lie on the plane of the ground. So, buildings in OSM are mostly just, like, outlines of buildings, drawn on the ground.

To describe things that aren’t so flat, other kinds of OSM tags have come into use, such as the “height” tag, as well as the “min_height” tag which describes how far off the ground something starts, e.g. a specific floor of a building.

Mapzen’s Tangram WebGL map-drawing library uses this information to extrude these outlines upward, creating 3D objects which, at a distance, resemble buildings. I chose the Empire State Building because, Empire State Building. Also, it’s easily recognizable and mostly made of flat surfaces and right angles. (I also squashed it to fun size, to maximize the fun.)

vector data of the Empire State Building

In most cases, the result satisfies our needs… but not always. I mean — have you ever really looked at a building? I mean really?

Well, if you ever have, you may have noticed that some buildings have more than one level of roof. OSM in its benevolent wisdom has provided another tag for these situations called “part”, to allow a building to be made from more than one “part”, as it were. Still other “relation” objects can be used to group these parts together into one conceptual building.

Many buildings in OSM are organized in this way, especially famous ones, which more frequently come under the scrutiny of eager mappers, and thus of OSM’s hoary cadre of data-purity enforcers. But in many other less-notorious cases, the buildings elude the gaze of the hoary cadre, and the data is messy, disjointed, or otherwise not up to spec.

Wild Tile

Complicating this situation is the fact that generally, this data is not accessed directly from OSM in its pure, crystalline form. OSM is a useful storehouse, but it isn’t designed as a high-performance production database. So, like many tools which use OSM data, Tangram makes use of a tile server, which decants bulk-exported OSM vector data into small, quantized “vector tiles” for improved access and transfer performance.

a single exported tile

This means that to fetch a building, you must fetch the tile which holds the building — and very often, buildings span tiles. In Mapzen’s tile server, a building “part” is copied into each tile it touches, and different parts of a building may be stored on different tiles. This is especially common for buildings with complex roofs or multiple towers, such as churches, castles, or fancy palace-type things.
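To see how easily a building ends up straddling tiles, here is a minimal sketch of the standard Web Mercator "slippy map" tile arithmetic. It assumes the tile server uses the common z/x/y scheme, which this post doesn't actually state; the function names and the example bounding box are invented for illustration.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) slippy-map tile containing a WGS84 point.

    Valid for latitudes within the Web Mercator range (about +/-85.05).
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tiles_for_bbox(west, south, east, north, zoom):
    """List every tile touched by a bounding box (no antimeridian handling)."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


if __name__ == "__main__":
    # Approximate, hand-picked box around a large Midtown building footprint.
    print(tiles_for_bbox(-73.9865, 40.7479, -73.9848, 40.7489, zoom=16))
```

If the list that comes back has more than one entry, the building's parts may be scattered across that many tiles, and an exporter has to fetch all of them.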

Tangram, in its quest for expediency, ignores all of this semantic intricacy and blithely redraws any duplicated parts; adjacent pieces are drawn with their various tiles, buildings spring forth apparently solid and unharmed, and no one is the wiser.
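An exporter can't afford to be quite so blithe. One possible hedge, sketched below, is to keep only the first copy of each feature when stitching tiles together. This assumes every feature is a plain dict carrying an "id" that is identical in every tile it was copied into; that layout, and the function name, are hypothetical rather than taken from the tile server or from Tangram.

```python
def merge_tile_features(tiles):
    """Merge features from several tiles, keeping one copy per feature id.

    `tiles` is a list of tiles; each tile is a list of feature dicts.
    Features without an "id" can't be matched up, so they are all kept.
    """
    seen = set()
    merged = []
    for tile in tiles:
        for feature in tile:
            fid = feature.get("id")
            if fid is not None:
                if fid in seen:
                    continue  # already collected from a neighbouring tile
                seen.add(fid)
            merged.append(feature)
    return merged


if __name__ == "__main__":
    west_tile = [{"id": 1, "kind": "building_part"}, {"id": 2, "kind": "building_part"}]
    east_tile = [{"id": 2, "kind": "building_part"}, {"id": 3, "kind": "building_part"}]
    print(len(merge_tile_features([west_tile, east_tile])))
```

Note that this only removes whole duplicated features. A part that has been clipped differently in each tile would need its geometry merged, which is a harder problem.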

This deception can cause headaches at export time if you’re not expecting it, which is the first of the two sub-problems I mentioned earlier. But assume for now that something building-shaped has been extrapolated from OSM data. That’s it — keep assuming for a few minutes while I describe 3D file formats.

Object Oriented

To humans, 3D objects are beautiful, complex things capable of eliciting unbearably strong feelings of attraction or disgust. But to a computer, a 3D object is just a bunch of numbers, grouped into sets of three. Call them “points.” When two points love each other very much, they make a line. And where three or more lines are gathered together, there a face is in the midst of them.

There are lots of 3D file formats, but at the core they all group numbers into sets of three to describe points. Some formats then group points into lines and lines into faces, but they’re still mostly just lists of numbers, so they’re relatively easy to write out and move around.

Not coincidentally, when Tangram extrapolates 3D geometry from the OSM data, the result includes a list of points. In the current Tangram scheme, these lists are stored in JavaScript objects, one object per layer per tile.

The process for converting these objects to 3D files (and indeed, from one 3D format to another) is mostly regexes and string shuffling. (You can check out the details in this GitHub repo: https://github.com/tangram-map/vbo-export/.) This process produces files which can be used in a variety of 3D apps, depending on the file type.
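To give a sense of how little string shuffling is involved, here is a minimal sketch that writes points and faces out as Wavefront .obj text. It is not the exporter's code. The function name is invented, and it only emits the two most basic .obj record types: "v" for a vertex and "f" for a face, with the format's 1-based vertex indices.

```python
def to_obj(vertices, faces):
    """Serialize a mesh as Wavefront .obj text.

    vertices: list of (x, y, z) tuples.
    faces: list of tuples of 0-based indices into `vertices`.
    """
    lines = ["v {} {} {}".format(x, y, z) for x, y, z in vertices]
    # .obj counts vertices from 1, so shift every index up by one.
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    triangle = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
    print(to_obj(triangle, [(0, 1, 2)]), end="")
```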

Once the tile was exported as an .obj and imported into SketchUp, I was able to trim it of everything that didn’t look like an Empire State Building.

And thence we come in our journey — having left the Headwaters of OSM and the Polygonal Badlands — to the next great Realm of Problem: the Vale of Tears. Paper tears. You see, because I was tearing up. Paper. Let me explain.

Papercrafty

Papercraft is an ignoble art, the inverse of origami: rather than finding common cause with paper’s whims and humours through the subtle art of folds and bends, the paper is violently coerced to assume a shape contrary to its nature through brute force of cutting, and restrained from obeying the compulsions of its disposition with chains of strongest glue.

Naturally, as in all other areas of paper coercion, the Japanese lead the way. There’s one app in particular (unfortunately not open-source, which kind of sags my soapbox) called Pepakura Designer. It costs $35, and is written for Windows (I used an emulator on my Mac), but there’s really no other sane way to do it. (I gave that freeware Linux utility far more than a fair shot. It does not work good.)

Pepakura enjoys a large fanbase which has produced an impressive body of tutorial and instructional content, so I will resist the temptation to add to it here, save for one detail: Pepakura, in order to discharge its duties, desires as its starting point a “solid,” which is a concept in 3D with a number of uses.

A “solid” is a three-dimensional topological polyhedron, or polytope. Put another way, it’s a three-dimensional orientable manifold with boundary. You will of course notice this implies that the Euler characteristic of the combinatorial boundary of such a polyhedron is 2. (For all you artsy types, this means the combinatorial manifold model of solidity guarantees the boundary of the solid separates space into exactly two components as a consequence of the Jordan-Brouwer theorem, thus eliminating sets with non-manifold neighborhoods.)

Put another way, a 3D solid is a single watertight piece, with no holes in the faces, intersections, subdivisions, or weird habits.

When Tangram builds a building, it extrudes the outline of each building or building part up to the appropriate height, then caps it with a roof face. Unless a part has a “min_height” tag, it is extruded from the ground up.
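The extrusion described above can be sketched in a few lines. This is an illustration of the idea, not Tangram's implementation: the function name is invented, the outline is assumed to be a simple counter-clockwise ring without a repeated closing point, and the roof is left as one polygon rather than being triangulated the way a real renderer would.

```python
def extrude(outline, height, min_height=0.0):
    """Extrude a 2D outline into wall quads plus a roof cap (no base).

    outline: list of (x, y) points in counter-clockwise order.
    Returns (vertices, faces); faces hold 0-based vertex indices.
    """
    n = len(outline)
    bottom = [(x, y, min_height) for x, y in outline]  # indices 0 .. n-1
    top = [(x, y, height) for x, y in outline]         # indices n .. 2n-1
    vertices = bottom + top
    # One quad per outline edge, joining the bottom ring to the top ring.
    walls = [(i, (i + 1) % n, (i + 1) % n + n, i + n) for i in range(n)]
    roof = tuple(range(n, 2 * n))
    return vertices, walls + [roof]


if __name__ == "__main__":
    square = [(0, 0), (10, 0), (10, 10), (0, 10)]
    verts, faces = extrude(square, height=30, min_height=20)
    print(len(verts), "vertices,", len(faces), "faces")
```

There is deliberately no base face here, which matters for the discussion of solids.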

So our Tangram models are disqualified from solid status right away — they don’t have a base because you’d never see it, they’re full of extra faces, and they can have lots of intersections. They are far from watertight, and their Euler characteristics are hardly worth mentioning.
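For the artsy types, a rough watertightness test is easy to sketch: in a closed mesh, every edge is shared by exactly two faces, and for a simple box-like shape the count of vertices minus edges plus faces comes out to 2. The code below is a minimal illustration with invented function names. It checks only that edge condition and the Euler characteristic, so it will not catch intersecting or overlapping parts.

```python
from collections import Counter


def edge_counts(faces):
    """Count how many faces use each undirected edge."""
    counts = Counter()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            counts[(min(a, b), max(a, b))] += 1
    return counts


def is_watertight(faces):
    """True if every edge is shared by exactly two faces."""
    return all(c == 2 for c in edge_counts(faces).values())


def euler_characteristic(faces):
    """V - E + F for the vertices and edges actually used by the faces."""
    vertices = {v for face in faces for v in face}
    return len(vertices) - len(edge_counts(faces)) + len(faces)


# A box extruded the way described above: four walls and a roof, no base.
open_box = [(0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),
            (4, 5, 6, 7)]
# The same box with a base face added underneath.
closed_box = open_box + [(3, 2, 1, 0)]

if __name__ == "__main__":
    for name, faces in (("open", open_box), ("closed", closed_box)):
        print(name, is_watertight(faces), euler_characteristic(faces))
```

The open box fails because the four edges around its missing base each belong to only one face.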

Our Tangram export function, so far, is perfectly happy to replicate this state of affairs, and the 3D file formats are (mostly) okay with it too. But this is not Pepakura-worthy.

Luckily, in a past life I was a professional 3D model pesterer, so I poked and tweaked and simplified the model until Pepakura accepted it as its own, and worked its mysterious magic, which is the true papercraft.

And it looks kind of neat! Very Empire State Building-like.

completed papercraft building

Conclusion

OpenStreetMap has lots and lots and lots of buildings. Most of them are either not so interesting or too complex for papercraft, and the vast majority don’t even have a “height” tag. Still, exportable buildings could be interesting or useful when imported into 3D apps for custom rendering or printed in some way.

Regardless, our exporter doesn’t yet produce papercraftable models straight out of the box. And Pepakura isn’t the only app which prefers solids to degenerate monster polys; many 3D printing services, including Shapeways, like their models solid too. Extrusion printers such as the MakerBot may have an easier time with unmodified exported data, depending on the slicer software used to prepare the models, but it would be nice to make the data more useful in more situations.

So we plan to modify the exporter so that it produces less wacky geometry, and perhaps eventually add a step to combine overlapping shapes to create a single solid, which will make everyone happy at once.

Parting Gifts

Here’s a link to the exporter in its current stand-alone form: https://github.com/tangram-map/vbo-export

Here’s a link to the papercraft chibi-buildings I made, in three trendy color schemes based on a few of Tangram’s procedural textures: https://github.com/tangram-map/vbo-export/tree/master/ESB_layouts

completed papercraft building

(If you attempt one of these, I recommend the also-not-open-source Scotch® Adhesive Dot Roller.)

And to wrap up, here are some famous buildings, as interpreted by OpenStreetMap, to remind us all that there’s still a lot of work to do.

some 3D buildings from OSM

Answers: Chrysler Building, St. Basil’s Cathedral, Colosseum, Eiffel Tower.