Merle

Dataclysm by Christian Rudder

Dataclysm: Who We Are (When We Think No One's Looking) by Christian Rudder

On its face this book sounds good: data guru uses the information people share online, particularly on the dating website OkCupid, to reveal demographic trends. There is some interesting information here, along with fun graphs and charts. But while Rudder may be a good statistician, he’s a poor sociologist, and the book is riddled with eyebrow-raising assumptions and conclusions. It also hangs together poorly, jumping from one disconnected subject to another, with chapters that share a fairly simple finding padded by repetitive discussions of the author’s methods and rhapsodizing about the scope of his data. For a better book on what Big Data says about us, I recommend the more recent Everybody Lies.

Unfortunately, Rudder begins the book with random, skewed guessing. In describing OkCupid, he confidently asserts that “[t]onight, some thirty thousand couples will have their first date because of OkCupid. Roughly three thousand of them will end up together long-term. Two hundred of those will get married[.]” This caught my attention immediately: 10% of online first dates leading to long-term relationships is a fantastic success rate, but less than 7% of long-term relationships ending in marriage seems awfully low for the 20s-and-up crowd. Curious what definition of “long-term” Rudder was using, I flipped to the notes at the back, only to find that he made it all up, extrapolating from the fact that the site has 4 million active users and 300 couples per day reporting that they are leaving OkCupid because they found someone on the site, plus his intuition that fewer than 1 in 10 long-term couples get married: “How many serious relationships did you have before you found the person you settled down with? I imagine the average number is roughly 10.” My own experience of the world is very different (I don’t think I know anyone who’s had 10+ long-term, serious relationships). And since the average American woman marries at 27 and the average man at 29, and, according to the CDC, the average adult woman reports 4 lifetime sexual partners while the average man reports 6-7, Rudder’s impression seems the likelier of the two to be skewed.
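For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic. It uses only the three figures from the quoted passage; the variable names are illustrative, and nothing here comes from Rudder's own calculations.

```python
# Figures taken from the quoted opening claim (per night)
first_dates = 30_000
long_term = 3_000
marriages = 200

# Share of first dates that become long-term relationships
long_term_rate = long_term / first_dates
# Share of long-term relationships that end in marriage
marriage_rate = marriages / long_term
# Implied number of long-term couples per eventual marriage
couples_per_marriage = long_term / marriages

print(f"first date -> long-term: {long_term_rate:.1%}")   # 10.0%
print(f"long-term -> marriage:   {marriage_rate:.1%}")    # 6.7%
print(f"long-term couples per marriage: {couples_per_marriage:.0f}")  # 15
```

On these figures, only about one long-term couple in 15 marries, which is even lower than the "roughly 10" in Rudder's note would suggest.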

The author’s conclusions are equally questionable. He observes that men seem to find 20-year-old women the most attractive (at least on a site evidently without teenagers) throughout their lives, while women’s view of male attractiveness changes to accommodate their own age, and concludes that middle-aged men don’t contact young women for fear of rejection and social judgment. This overlooks the fact that there’s much more to a relationship than physical attractiveness; how many 50-year-old men want to live in a world of exam stress and frat parties, with a partner who has comparatively little life experience?

Another chapter seems to confuse correlation and causation. In “You’ve Gotta be the Glue,” Rudder explains that couples in which each partner has multiple clusters of Facebook connections from different areas of life, and is the only person connected to the other’s various tribes, last longer than couples who are connected to all the same people, who all know each other. This makes sense: if you belong to several social groups (co-workers, college friends, book club, etc.) and your partner has gotten to know all of them, your relationship is well-established and likely serious. But if you belong to a tight-knit community and start dating someone within your group, your Facebook connections provide no indication of how serious you are. Rudder, however, interprets the data as proving causation, concluding that the “specialness” of the couple in being the “glue” between different social groups somehow boosts the relationship. He fails to explain how “connecting” his gaming buddies to his wife’s extended family strengthens their marriage – presumably if these social groups cared to mingle much, they’d befriend each other on Facebook and then what happens to the couple’s “specialness”?

When the book moves away from dating-related data, it becomes a series of disconnected one-off chapters. There’s a discourse about group rage on the Internet that involves little data analysis and seems to be included because the author is interested in group rage on the Internet. There’s a chapter about the language used in Twitter posts, concluding that Twitter definitely isn’t killing sophisticated thought because “a,” “and,” and “the” are among the top 10 words used in English both on Twitter and off of it. There’s an equation meant to demonstrate that multiplying a word’s frequency rank in a text by its number of uses will result in a constant, but the chart meant to illustrate this point with Ulysses displays a “constant” ranging from 20,000 to 29,055.
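The rank-times-frequency relationship described here is usually called Zipf's law, and it is an approximate empirical pattern rather than an exact identity. The rough sketch below shows how one might check it on any text, and how wide the spread in the Ulysses chart is. The function name `rank_times_count` and the toy sentence are illustrative assumptions, not anything from the book.

```python
import re
from collections import Counter

def rank_times_count(text, top_n=10):
    """Return (rank, word, count, rank * count) for the top_n most frequent words."""
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top_n)
    return [(rank, word, count, rank * count)
            for rank, (word, count) in enumerate(ranked, start=1)]

# Spread of the "constant" in the Ulysses chart, from the two endpoints quoted in the review
low, high = 20_000, 29_055
spread = high / low  # ratio of the largest value to the smallest

# Toy text for illustration only; a real check would need a full book-length text
sample = "the cat and the dog and the bird"
for row in rank_times_count(sample, top_n=3):
    print(row)
print(f"max/min ratio in the chart: {spread:.2f}")
```

With the endpoints quoted above, the largest value is about 45% bigger than the smallest, which is a wide range for something presented as a constant.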

All that said, there is some interesting material here, particularly the data on race. The chapter on racist Google searches is less relevant now that the author of that study has written his own book (the aforementioned Everybody Lies); and Dataclysm, published in 2014, has a rosier view of this than the 2017, Trump-era version. But the study showing massive racial differences in how people rate one another’s attractiveness is still quite relevant: key findings include the fact that people tend to view members of their own race as more attractive than others, but black Americans take a major hit in the ratings from everybody (including other black people, though to a lesser degree). My first reaction on reading this was that it’s hard to judge people for preferring cultural commonalities in their most intimate relationships. But the data isn’t so simple: it’s based on how people rate a photo, not whom they choose to contact, and attractiveness doesn’t only affect one’s dating prospects, but employment too (there’s a chart on that). And in-group biases in American society are hardly limited to dating; while our neighborhoods, schools, workplaces, churches, and friend groups are still largely separate, I’m inclined to believe that Rudder’s data does show hidden bias.

Overall, while there are interesting nuggets in here, I wouldn’t recommend the book. A few interesting data points are padded out to book length with ill-conceived interpretations and rambling. By the end I was simply tired of it: the writing didn’t engage me when unaccompanied by charts, the book lacks cohesion, and the author had lost far too much credibility. Try Everybody Lies instead.