Tuesday, December 11, 2018

Are Bikes Legally Faster or Just Faster (and What Difference Does It Make)?

As a bicyclist who rides for both recreation and commuting, articles about bike-vs-car speeds catch my attention. Here's one backed up by data! Yes, actual data (because, you know, data matters).

However, I have questions about the data. From my own bike commuting experience, I fully accept that a bike can be as fast as a car. As far as I can tell, though, there are only two ways that a bicycle can actually be faster than a car.

  1. The bike has access to different routes than cars. There could be bike paths that cut through parks, across rivers, etc. that cars can't drive on. This allows bikes to bypass congested streets and intersections. There could also be dedicated bike lanes along the roads that give bicycles different legal rights. I experienced this on a recent visit to Washington DC. There was a path near our hotel that went past the airport, through a park, and over the river with no stop signs. It was shorter and faster than the best car route.
  2. The bike uses the same routes as cars but passes on the right at every traffic signal. In heavily congested traffic, this allows bikes to get ahead of the cars. When traffic is stopping every few blocks, this can create a significant time advantage for bikes over cars. I've also experienced this in a variety of cities.

So here's the problem.

Under #1 above, too many cyclists think that sidewalks are legitimate routes for bikes. In most cities, that's illegal. It's also dangerous when there's pedestrian traffic. To clarify, I'm talking about actual sidewalks, not multi-user paths. Legally, those are two different things.

Under #2, in the absence of dedicated bike lanes, passing on the right is usually illegal and dangerous.

So when I read a headline like "Data From Millions of Smartphone Journeys Proves Cyclist Faster", I'd like to know what percent of those "millions" of journeys were completely legal. Based on my personal experience biking and talking to other bikers, I'm suspicious, but I don't have data. Even if I had their data, I doubt that it would show whether or not the bikers or drivers broke any laws.

That leaves me at an impasse. Without data, I can't find evidence to support or contradict my suspicion.

That brings me to the second part of my title: does it matter?

Let's assume that I'm right and a large proportion of the "bikes are faster" data is from illegal riding. If the law is rarely, or never, enforced then is it really a law? The last time I saw an officer stop a bicycle for riding on a sidewalk was when Barney Fife stopped the spoiled kid in downtown Mayberry. I've never seen a bike stopped for passing on the right. If our culture accepts this sort of biking, then maybe it doesn't matter if it's technically illegal and it's fair to simply say "bikes are faster".

On the other hand, what if you're the insurance company that provides workers' compensation or liability coverage for Deliveroo? If a delivery rider gets injured or causes injury, then your company's financial responsibility could change if the rider was breaking the law. Even if "bikes are faster", you might want to encourage the use of cars if bikes are more likely to break the law.

I would argue that the legality issue does matter. Now how do we get the data?

Friday, December 7, 2018

Data Access Versus Data Privacy: US Census

I came across an interesting article from The Upshot in a post from FlowingData. It seems that analysts have found ways to pull individual data records out of aggregated Census data that is supposed to protect our privacy.

This problem shouldn't be a surprise to anyone who has spent time working with Census data. We use census data extensively in my introductory statistics class. One semester the class spotted a divorced 13-year-old female in our sample. The sample included her county and state.

Another time, we uncovered a 42-year-old female lawyer whose fourth marriage was within the last year. Again, we had county and state information.

We didn't try to figure out who they were, but we talked about whether or not we could figure it out. It depends on where they lived. Had both of them lived in Los Angeles County, California (population 10.2 million), it would have been difficult. Had they lived in Langlade County, Wisconsin (population 19,000), it wouldn't have been very hard to go through public marriage and divorce records to find them.

On the negative side, it's amazing how little statistical ability someone needed to spot these opportunities for bypassing privacy.

On the positive side, these two had to stand out from the rest of the data in order to get noticed and most of us don't stand out.

However, a little more statistical ability and a few more variables change what it takes to "stand out". Maybe you and I are more unusual than we think we are and, therefore, we're easier to identify. That's what the researchers in the linked article claim.
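Here's a rough sketch of that idea in Python. The records are invented for illustration (they are not real Census data), and the function name unique_records is my own. It simply counts how many records are the only one with their particular combination of values, and shows how that count grows as more variables are added.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration (not real Census data).
RECORDS = [
    {"county": "A", "sex": "F", "age": 42, "marital": "married", "occupation": "lawyer"},
    {"county": "A", "sex": "F", "age": 42, "marital": "married", "occupation": "teacher"},
    {"county": "A", "sex": "F", "age": 13, "marital": "never married", "occupation": "student"},
    {"county": "A", "sex": "F", "age": 13, "marital": "divorced", "occupation": "student"},
    {"county": "A", "sex": "M", "age": 42, "marital": "married", "occupation": "lawyer"},
    {"county": "A", "sex": "M", "age": 42, "marital": "married", "occupation": "lawyer"},
    {"county": "A", "sex": "M", "age": 30, "marital": "married", "occupation": "teacher"},
    {"county": "A", "sex": "M", "age": 30, "marital": "married", "occupation": "teacher"},
]


def unique_records(records, keys):
    """Return the records whose combination of values on `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    return [r for r in records if combos[tuple(r[k] for k in keys)] == 1]


# Add one variable at a time and see how many people "stand out".
keys = ["county", "sex"]
for extra in [None, "age", "marital", "occupation"]:
    if extra is not None:
        keys.append(extra)
    print(keys, "->", len(unique_records(RECORDS, keys)), "unique")
```

With this toy sample, nobody is unique on county and sex, or on county, sex, and age. Adding marital status singles out two people, and adding occupation singles out four. Some records never become unique, which is the "most of us don't stand out" side of the argument.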

Before you panic and refuse to participate in any future Census, read the article. The Census Bureau is aware of the problem and they're working on it. That's both good and bad. It's good that the Census Bureau takes our individual privacy seriously, but it's bad that the solution might be intentionally screwing up the data.

One solution is virtually moving people (a nice way of saying "falsify the data"). Maybe the 42-year-old lawyer doesn't actually live where the data says she lives. Is it OK to swap her with another woman in a different census block (the smallest geographic unit)?  If that's OK, is it OK to swap her county? How about her state?

It depends on your level of geographic analysis. Counties and states are often poor units (see MAUP, the modifiable areal unit problem). Census tracts or blocks are more useful.
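To make the "it depends on your level" point concrete, here's a minimal sketch in Python. The people are invented, the function names are mine, and this is not the Census Bureau's actual procedure, just a bare-bones illustration of swapping two records' locations. A swap inside one state leaves state-level counts alone but changes county-level counts.

```python
from collections import Counter

# Hypothetical people, invented for illustration (not real Census data).
PEOPLE = [
    {"id": "p1", "state": "WI", "county": "Langlade", "occupation": "lawyer"},
    {"id": "p2", "state": "WI", "county": "Dane", "occupation": "teacher"},
    {"id": "p3", "state": "CA", "county": "Los Angeles", "occupation": "lawyer"},
]


def swap_geography(people, id_a, id_b):
    """Return copies of the records with the two people's state and county exchanged."""
    by_id = {p["id"]: dict(p) for p in people}  # copy so the originals are untouched
    a, b = by_id[id_a], by_id[id_b]
    for field in ("state", "county"):
        a[field], b[field] = b[field], a[field]
    return [by_id[p["id"]] for p in people]


def lawyers_by(people, level):
    """Count lawyers per geographic unit at the given level ('state' or 'county')."""
    return Counter(p[level] for p in people if p["occupation"] == "lawyer")


# Swap the Langlade County lawyer with the Dane County teacher (same state).
swapped = swap_geography(PEOPLE, "p1", "p2")
print("state counts unchanged: ", lawyers_by(PEOPLE, "state") == lawyers_by(swapped, "state"))
print("county counts unchanged:", lawyers_by(PEOPLE, "county") == lawyers_by(swapped, "county"))
```

In this toy case the state comparison comes out True and the county comparison comes out False. A researcher working at the state level would never notice the swap, while one working at the county level now has a lawyer in the wrong place.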

As a researcher, I want the best data I can get and Census data is considered the gold standard of publicly available data (trust me, private companies know a lot more about you). On the other hand, I value data privacy.

I'm glad that the Census Bureau has to solve this instead of me.

Tuesday, November 13, 2018

If you doubt that data matters...

What was Amazon looking for with their widely publicized search for an HQ2 location?

One could answer by looking at their list: proximity to international airport, mass transit, regional population, ...

However, some are saying that Amazon was actually looking for DATA. Here are two articles:

Neither article is very long so you should read them yourself, but I'll provide a couple of interesting quotes:

From Bloomberg "But it kept hundreds of millions of dollars worth of free information from the cities to create the biggest corporate site location database in the world, according to Richard Florida, an urban studies professor at the University of Toronto."

From Reason: "Amazon is now privy to information about where different municipalities are going to direct investment and infrastructure in the near future. The company can exploit this information. ... Maybe Amazon just happens to purchase a new fulfillment center right around a soon-to-be-developed locale which would see increased demand for Amazon products. Maybe it simply decides to squat on land for a while, knowing that it will soon be smack dab in a hive of activity. A new brick-and-mortar store? They'll have the option. Or maybe knowing where new roads will be built will make it easier for Amazon to plan transit routes. There's profit to be extracted from this data that you and I could not even conceive."

Whether Amazon played a game just to obtain data or the data is a side benefit of an honest search, it's clear that data matters.

By the way - while not the same level and volume of data that Amazon got, ALL of us have access to a great deal of government data for free. Check out IPUMS.

Tuesday, October 16, 2018

Demographics is Destiny?

I just came across this link on cities and "peak Millennial" from posts by Digging Data.

We seem to like making broad generalizations when comparing generations. I usually cringe when I hear them because there's rarely much data behind the statements.

In particular, I'm getting tired of hearing how Millennials are so different from any prior generation. I don't see much of it. They show up in my classes. Some are smarter and some not so much. Some are lazy. Some are industrious. Some are liberal, some are conservative, and some don't even think about politics. I could go on but, essentially, they aren't that different from the students who came before them (or the ones before that or the ones before ...).

Still, there is data to support some generalizations about them. As post-college young adults, they - on average - seem more drawn to urban environments. Data supports that. However, some then predicted that their generation would completely revive the urban landscape for decades.

Maybe not. The article linked above says that, even though Millennials are marrying and having children later in life (which is data supported), their housing and community preferences for the married-with-children stage of life might not be all that different from their predecessors:

"But with a view of history and demographics, it’s not difficult to imagine a future where that love [of city life] fades with the years, and a different sort of life starts to seem appealing. Millennials have shown a tendency to delay marriage and children, and thus occupy their studio apartments in urban cores for longer. But that’s no reason not to be concerned that school quality and more space might factor into their choices as they age."