the kdmcinfo weblog

GateHouse vs. The New York Times

In a case with far-reaching implications for the widespread practice of automated aggregation of headlines and ledes via RSS, GateHouse Media has, for the most part, won its case against The New York Times Co., which owns Boston.com, which in turn runs a handful of community web sites. Those community sites were providing added value to their readers in the form of linked headlines pointing to resources at community publications run by GateHouse. The practice of linked headline exchange is healthy for the web, useful for readers, and helpful for resource-starved community publications. However, for reasons that are still not clear (to me), GateHouse felt that the practice amounted to theft, even though GateHouse itself was publishing the RSS feeds to begin with.

Trouble is, RSS feeds don't come with Terms of Use. Is a publicly available feed meant purely for consumption by an individual, and not by other sites? After all, the web site you're reading now is publicly available, but that doesn't mean you're free to reproduce it elsewhere. The common assumption is that a site wouldn't publish an RSS feed if it didn't want that feed to be re-used elsewhere. And that's the assumption GateHouse is challenging.

Let's be clear - this is not a scraping case (scraping is the practice of writing tools that extract content from web pages automatically when no RSS feed is available). Boston.com was simply using the content GateHouse provided as a feed. I would agree that scraping is "theft-like" in a way that RSS consumption is not, but that's not relevant here.
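The distinction matters technically, too: consuming a feed means parsing structured XML that the publisher deliberately emits at a known URL, not reverse-engineering page markup. Here's a minimal sketch in Python of how a site like Boston.com might pull headlines and links out of an RSS 2.0 feed - the feed content and URLs below are hypothetical stand-ins, not anything GateHouse actually published:

```python
import xml.etree.ElementTree as ET

# A tiny hypothetical RSS 2.0 document, standing in for what a
# publisher would serve at its feed URL.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Community News</title>
    <item>
      <title>Town council approves budget</title>
      <link>http://example.com/budget</link>
      <description>The lede of the story...</description>
    </item>
    <item>
      <title>Local school wins award</title>
      <link>http://example.com/award</link>
      <description>Another lede...</description>
    </item>
  </channel>
</rss>"""

def headlines(rss_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

# Render the feed items as linked headlines, aggregator-style.
for title, link in headlines(RSS_SAMPLE):
    print(f'<a href="{link}">{title}</a>')
```

In a real aggregator the inline string would be replaced by an HTTP fetch of the feed URL, but the point stands: everything consumed is data the publisher chose to emit in a machine-readable format, which is exactly what makes this different from scraping.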

In a weird footnote to all of this, GateHouse initially claimed that Boston.com was trying to work around technical measures it had put in place to prevent copying of its material. Those "technical measures" amounted to JavaScript in GateHouse's web pages, but Boston.com was of course not scraping the site -- it was merely taking advantage of the RSS feeds freely provided by GateHouse. In other words, GateHouse was putting its "technical measures" in its web pages, not in its feed distribution mechanism, missing the point entirely.

GateHouse seems primarily concerned with the distinction between automated insertion of headlines and ledes (e.g. via RSS embeds) vs. the "human effort" required to quote a few grafs in a story body. Personally, I don't see how the two are materially different, or how one method would affect GateHouse publications more negatively or positively than the other. If anything, now that GateHouse has gotten its way, they're sure to receive less traffic.

The result is that Boston.com has been forced to stop using GateHouse RSS feeds to automatically populate community sites with local content. If cases like this hold sway, there will soon be a burden on every site interested in embedding external RSS feeds to find out whether it's OK with each publisher first.

PlagiarismToday sums up the case:

It was a compromise settlement, as most are, but one cannot help but feel that GateHouse just managed to bully one of the largest and most prestigious news organizations in the world.

Also:

The frustrating thing about settlements, such as this one, is that they do not become case law and have no bearing on future cases. If and when this kind of dispute arises again, we will be starting over from square one.

I'm trying to figure out who benefits from this decision... and I honestly can't. GateHouse loses. Boston.com loses. Community web sites with limited resources lose. And readers lose. Something's rotten in the state of Denmark.