The open data movement was rocked Tuesday by news that a well-known activist was arrested on federal hacking charges for downloading large chunks of data from a subscription academic database. Aaron Swartz, co-founder of Condé Nast's Reddit, pleaded not guilty to all counts at an arraignment in Boston this morning, Ryan Singel of Wired--also owned by Condé Nast--reports, and is free on $100,000 bail.
If convicted, Swartz faces up to 35 years in prison and a $1 million fine.
The story is a tangled one. Swartz is known for published works analyzing large numbers of academic publications, and JSTOR, the non-profit database service of academic publications Swartz allegedly hacked, released a statement saying that while it was cooperating with the investigation, it was not behind the charges.
The question is whether Swartz's actions should properly be called hacking, and what, precisely, Swartz planned to do with all those academic papers.
Swartz, at 24, is a well-known quantity in the tech world. He helped author the first version of RSS at age 14, and was an early partner in Reddit, the user-generated news service.
And Swartz has long extolled the virtues of open data. The Times reports that he published an unsigned manifesto about the need for data to be free in 2008, and he previously tangled with the Feds over the extent to which he mined data from a trial run of opening up the federal court records database. (The details of how Swartz managed to pull the reported 4 million documents from JSTOR are fascinating--Ars Technica has a good explainer.)
The prosecutor in the case emphasized the hacking-as-theft angle in the press release accompanying the indictment. "Stealing is stealing whether you use a computer command or a crowbar, and whether you take documents, data or dollars," U.S. Attorney Carmen Ortiz said.
The case is attracting quite a bit of attention, not only for Swartz's pedigree but, as tech writer Nancy Scola notes, because open data has become a cause célèbre in certain Washington circles. "It's easy to forget that there's something at all controversial or oppositional about accessing information, or that some people really, really want data to be free -- and others don't. Open data has been mainstreamed."
Some have taken the opportunity to highlight the stranglehold that services such as JSTOR put on academic publishing. Privacy researcher Christopher Soghoian speculated on Twitter, "How many of the academic papers that [Swartz] is accused of 'stealing' are hidden behind paywalls against wishes of the academic authors." And software engineer Kevin Webb has found himself in similar situations. "Aaron's arrest should be a wake up call to universities," Webb wrote. "Evidence of how fundamentally broken this core piece of their architecture remains, despite decades of almost unimaginable progress in advancing communication and collaboration."
Whether Swartz's actions will ultimately further that cause, whatever his motivation, remains a question. "Ultimately, it's not clear what the point was," Timothy B. Lee wrote on Forbes.com. "Even if Swartz had obtained the entire JSTOR archive and released a copy of it onto the darknet, it's not clear how that would meaningfully advance the goals of the open access movement."
In the meantime, Demand Progress, the progressive civil liberties nonprofit Swartz founded, is hosting a petition of support. Swartz's court date is Sept. 9.