Archive for December, 2005

Stephen Kurkjian: admit your error

Wednesday, December 28th, 2005

Stephen Kurkjian, a writer for the Boston Globe, reported today on the resignation of Massachusetts CIO Peter Quinn. Kurkjian failed to mention that his poor reporting was one of the causes of Quinn’s resignation.

The background to Quinn’s resignation involves his controversial initiative that will require that all Massachusetts government computer systems store documents in OpenDocument format by January 1st, 2007. Microsoft, the major software supplier to the government, would naturally prefer that Massachusetts mandate the use of their new non-standard XML formats. The alternative, OpenDocument, is a standard already used by software shipping today; additionally, it is approved by an international standards body, OASIS, and has been submitted to the ISO.

In the past few months, Microsoft has been trying to argue against the new policy. In the middle of all this, Stephen Kurkjian, a veteran reporter at the Boston Globe, wrote an article entitled: “Romney administration reviewing trips made by technology chief.” The article alleges that Peter Quinn made sponsored trips to technology conferences without filling out the correct forms. The “review” was instigated by the Globe, as described in this quote from Kurkjian’s article: “The state launched its inquiry after the Globe began asking questions about the trips earlier this week; it is being conducted by Thomas H. Trimarco, the head of Administration and Finance.”

That was on November 26, 2005. About two weeks later (December 10, 2005), the Globe admitted that, in fact, Peter Quinn had done nothing wrong. Specifically, Kurkjian writes that, “[Quinn’s boss at the time, Eric Kriss] confirmed that he had verbally approved all of Quinn’s requests to travel to conferences in 2005. Kriss said he relieved Quinn of the responsibility of filling out the forms for the trips this year because he felt that the reason that the regulation had been put in place originally — the fiscal crisis of the mid-1990s had cut out all state-funded travel — had expired.”

Two more weeks pass, and on December 24, 2005, Quinn sends an email to his staff announcing his resignation. According to a report from Robert McMillan of Macworld, his email included the following: “‘Over the last several months, we have been through some very difficult and tumultuous times . . . Many of these events have been very disruptive and harmful to my personal well being, my family and many of my closest friends.’”

In his article today (December 28, 2005), Kurkjian quotes the same phrase, “some very difficult and tumultuous times.” Kurkjian’s next line is: “Quinn had been the subject of a review by his current boss, Administration and Finance Secretary Thomas H. Trimarco, following a report in November that Quinn had failed to fill out the required state forms to allow his appearances at numerous out-of-state conventions in 2005, where his visits were, for the most part, paid for by convention organizers. Trimarco’s review found that Quinn had authorization to make the trips and had not violated any conflict of interest provisions.”

It’s shocking that Stephen Kurkjian, while explaining that Quinn was quitting because of the stress of recent events, fails to mention that he was personally the cause of one of the most significant events. Kurkjian mentions “a report in November that Quinn had failed to fill out the required state forms.” This fails to acknowledge that the report was a newspaper article, not a formal report of any sort; that the report was wrong; and that the report was written by Stephen Kurkjian, the same guy now reporting on Quinn’s resignation.

It’s certainly possible that most of the stress that pushed Quinn to quit came from other sources– for example, the testimony of Microsoft’s Alan Yates, or the strange misunderstandings of Representative Pacheco. However, given that Kurkjian wrote the erroneous report, he should take responsibility for his error.

Daniel Brandt and Wikipedia’s reliability

Friday, December 16th, 2005

Daniel Terdiman has posted an interesting interview with Daniel Brandt about his efforts to locate Brian Chase, the man who inserted false information into John Seigenthaler’s biography on Wikipedia. A remark Daniel Brandt made in the comments attached to the interview caught my attention. Talking about tracking down Chase, he said: “Yes, I got lucky because there was a server on that address. I have never claimed that I was anything but lucky. Non-technical journalists frequently see the Internet infrastructure as some sort of black box, and ‘cyber-sleuth’ makes better copy than ‘I got lucky.’ That’s not my fault.”

That’s an example of a journalist spreading misinformation about him. Brandt has no recourse, and he doesn’t care. He doesn’t care because it’s a minor distortion, not libelous, but it’s strange that he offers that defense for himself but agrees with John Seigenthaler’s righteous fury. Couldn’t Seigenthaler also say “That’s not my fault”?

I think taking Wikipedia to task for permitting libel despite their best efforts is silly. Wikipedia is about as dangerous as me writing “Special Lecture at Harvard, 7 pm, December 22nd. Professor Henderson will discuss John Seigenthaler’s involvement in planning the Kennedy assassination,” on a piece of paper and posting it in Harvard Square. The levels of traffic and exposure are about the same. It’s true that Wikipedia in the aggregate gets lots of traffic, but not Seigenthaler’s page. If Seigenthaler’s page were a high traffic page, the error would get caught sooner. The police are not fact-checking flyers in Harvard Square, so why police Wikipedia? If the Harvard Square example is ridiculous, then pick any other forum or mailing list on the internet.

Since I was in 3rd grade, people have been telling me “Don’t believe everything you read!” and “Don’t believe everything you see on TV!” Those are educational cliches. When Nature looks at Wikipedia vs. Britannica statistically, Wikipedia does pretty well. So what’s the problem?

Note: Terdiman’s article bears the ZDNet tag “open source.” Why? Is Daniel Terdiman tagging his own articles about Wikipedia with the tag “open source,” even though he wrote an article last week arguing that the term “open source” was inappropriate for Wikipedia?

RSS aggregators

Friday, December 16th, 2005

I don’t understand why RSS aggregators don’t suck. I’ve tried lots of different RSS aggregators over the past two years or so, and as far as I can tell, the basic effect is to take all the news websites and blogs that I visit, strip out the graphical elements, and cram them into a paned interface reminiscent of Outlook. Why is this better than tabs with the sites open in them?

RSS + SSE and Greasemonkey turns the web into Wikipedia

Friday, December 16th, 2005

I’ve been thinking about the SSE extension and how it might be combined with Greasemonkey to fix a lot of shyste. Here’s an example: you’re reading a webpage, and you see a spelling error. You highlight the word, right click on some icon in the status bar, and select “spellcheck” or something like that. The point is that somehow, a Greasemonkey script records your correction, and whenever you visit that URL, it applies your correction.

Then, on a periodic basis, your browser contacts a website (THIS IS WHERE RAILS WOULD BE USED!) and updates an RSS + SSE feed of your corrections to the web (er, “the living web”). These feeds get aggregated, so as I’m browsing, I have a cache of Greasemonkey scripts shared and maintained in concert with people that I trust through RSS + SSE. The spelling example is minor– it could be expanded to be website commentary, adblocking (ooh! controversial!), or whatever. In fact, it has the potential to make the entire web as unreliable as Wikipedia!
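Here is a minimal sketch, in Python rather than an actual Greasemonkey script, of the merging step described above. Everything in it is an assumption made for illustration: the feed layout (author, then URL, then a mapping of wrong text to corrected text), the function names, and the rule that a later feed wins when two trusted authors disagree. It does not touch RSS or SSE at all; it only shows how corrections from trusted people might be combined and applied to a page.

```python
def merge_corrections(feeds, trusted):
    """Merge per-URL corrections from trusted authors only.

    feeds maps author -> {url: {wrong_text: corrected_text}}.
    When two trusted authors correct the same text, the one that
    appears later in `feeds` wins (an arbitrary choice for this sketch).
    """
    merged = {}
    for author, corrections in feeds.items():
        if author not in trusted:
            continue  # ignore feeds from people we have not chosen to trust
        for url, fixes in corrections.items():
            merged.setdefault(url, {}).update(fixes)
    return merged


def apply_corrections(url, text, merged):
    """Apply every stored correction for this URL to the page text."""
    for wrong, right in merged.get(url, {}).items():
        text = text.replace(wrong, right)
    return text


# Hypothetical feeds: one from a friend, one from a stranger.
feeds = {
    "mike": {"http://example.com/post": {"recieve": "receive"}},
    "stranger": {"http://example.com/post": {"receive": "BUY NOW"}},
}
merged = merge_corrections(feeds, trusted={"mike"})
print(apply_corrections("http://example.com/post", "I recieve mail.", merged))
```

The trust filter is the part that matters: without it, anyone publishing a feed could rewrite the pages you read.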

I suspect that there could be scalability problems with this idea. There could also be stupidity problems.

Citizen journalists vs. the Boston Globe

Saturday, December 10th, 2005

The Boston Globe, a subsidiary of the New York Times, published a shameful article by Stephen Kurkjian about Massachusetts CIO Peter Quinn failing to provide detailed estimates about trips he took to conferences in 2004. The smoking gun of the article was: “He provided the name of the conferences he was attending, but only the total amount of money that the trip cost on three of them . . .” Before publication, the Globe failed to contact Quinn’s supervisor at the time, former Secretary of Administration and Finance Eric Kriss.

Now, 14 days later, Kurkjian has published an article that starts, “[Peter Quinn] did not violate conflict-of-interest standards or other rules when he took 12 out-of-state trips to attend conferences during the past year without obtaining the written approval of his boss . . .”

This time, Kurkjian managed to get in touch with Eric Kriss: “‘I knew of every trip that Peter was taking, and I approved them all,’ Kriss said.”

All of this occurred as Quinn was (and still is) involved in a struggle to move Massachusetts toward the use of OpenDocument, a file format for electronic documents. When a journalist goes off looking for dirt on a politician and prints a pile of unsubstantiated allegations, that’s called a witchhunt. Stephen Kurkjian, what were you thinking?

That brings us to the difference between citizen journalists, like me or the writers for Wikinews, and professionals like Stephen Kurkjian. When I talk about blogging with people who read paper newspapers on a regular basis, the objection I hear most often is: “How do I know that I can trust them? They could be anyone!” With me and the amateurs at Wikinews, you can’t trust us beyond the record that we’ve established on the web. However, bloggers (like me, or my friend Mike at The Unauthorized Participant) have no incentive to lie to you. If I don’t feel like writing about anything, then I don’t. I go downstairs and read a book, and that’s it. Mike has a little incentive, as he has ads on his blog, but I suspect he makes about $0.03 per year from them.

The same is true of Wikinews. Last spring, I was unemployed for about 2 months. I wrote a lot of stories for Wikinews, and it was satisfying. Now, I have a new job and less free time. The OpenDocument story has caught my attention. I live in Massachusetts, and I want people to know that the state is on the verge of adopting an open standard for office documents. That’s the entirety of my agenda. I don’t work for Sun or IBM or anyone else who stands to gain from the adoption of OpenDocument. I work doing IT support for an environmental foundation, and I would like to end the scheme of forced upgrades in which Microsoft makes us upgrade to new versions of Office when all we need are the features that were available in Word 97. I want the truth to be told because I think it will make the world a better place.

Stephen Kurkjian, a professional journalist, has a different structure of rewards. The New York Times Company pays him to research and report on local news. Lots of the stuff he’s written has been great (notably, his work in Boston’s Catholic church abuse scandal in 2002). Other stuff, like the recent article trying to taint Peter Quinn without sufficient evidence, is embarrassing. The crux is that Kurkjian can’t go downstairs and read a book. If he wants to continue in his job, he has to write engaging content for the people of Boston to read, even if there is no story to be told. That’s why I am at least as skeptical of professional journalists as I am of the amateurs.

Alright, this is getting dull. I’m going to go read a book (Thomas P. M. Barnett’s Blueprint for Action, if you were wondering). (Look out! He’s a blogger!)

Why wait to see whether history will repeat itself with Microsoft and Office 12 formats?

Saturday, December 10th, 2005

David Berlind has posted another great summary of recent events involving Microsoft’s Office 12 formats, OpenDocument, ECMA, and the ISO. The quick summary is that Microsoft has submitted a plan to get ECMA approval of their formats for Office 12. ECMA would then submit the standard to the ISO for approval. Microsoft has said that they will not sue anyone for infringing their patents in the course of implementing the Office 12 file formats. There is some concern about the possibility of Microsoft suing people who offer only partial implementations of the formats.

In response to one of the comments on his blog, Berlind writes: “You could say ‘history will repeat itself.’ But then again, Microsoft’s covenant not to sue was a pretty big change for the company. So, if it can make that change, can’t it also break from its past in other ways?”

I think Microsoft might be genuinely changing. On the other hand, maybe they’re not. What is the incentive for the residents of Massachusetts, such as me, to give Microsoft the benefit of the doubt? OpenDocument is already approved by OASIS, and it’s likely that it will pass through the ISO before Microsoft’s formats. OpenDocument is available for use in multiple products, right now.

So why should I want my government to wait? Will Microsoft’s formats provide some technical advantage? We’re talking about document formats for an office suite. Will Microsoft’s format allow me to write a memo automatically laced with top-notch witticisms? Enhanced support for embedding physical objects in documents? Will it be the Web 2.0 of file formats? Can I tag their documents and batch upload them to Flickr with the click of a button?

If Microsoft’s intentions are actually good– if they aren’t just trying to hold on to their lock on formats for as long as they can– why won’t they support OpenDocument? The argument for Microsoft centers on their formats being just as good as OpenDocument. If they’re just as good, then let’s pick OpenDocument and avoid the risk that Microsoft may not have our best interests at heart.

I’d love to hear a rebuttal to my argument on technical grounds. The only comparison of the two formats I know of is on Groklaw, but I didn’t think the article was very good. It didn’t give me a good understanding of the relative benefits of what the authors call “mixed” and “non-mixed” models for XML. It did make the good point that OpenDocument reuses other existing standards like SVG, while Microsoft does not.

(I’ve been waiting for widespread SVG support for years; I can’t believe that we still use the gd library for the creation of graphics programmatically when we could use SVG.)
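To show what creating graphics programmatically with SVG can look like, here is a minimal sketch in Python that builds a small bar chart as SVG text. The function name, the default sizes, and the fill color are all made up for this example; the only SVG features it relies on are the `svg` root element and `rect`. Because SVG is plain XML, no graphics library is needed to produce it.

```python
def bar_chart_svg(values, bar_width=20, gap=5, height=100):
    """Return an SVG document (as a string) with one bar per value.

    Bars are scaled so the largest value fills the full height.
    """
    peak = max(values) if values else 0
    width = len(values) * (bar_width + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for i, value in enumerate(values):
        bar_height = round(height * value / peak) if peak else 0
        x = i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points down, so bars hang from y
        parts.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" '
            f'height="{bar_height}" fill="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg = bar_chart_svg([1, 2, 4])
print(svg)
```

Saving the printed text to a file with an `.svg` extension should give something that an SVG-capable viewer can open.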

“Open source” is not a metaphor

Friday, December 9th, 2005

Daniel Terdiman at CNet has an article up today claiming that “the open source label doesn’t really fit Wikipedia.” He’s right about that, but his explanation is wrong.

Here’s what Terdiman says about open source: “‘Open source,’ at least the way it’s been used in tech circles over the years, usually connotes successful, volunteer projects like the Linux operating system, which has strict controls and is monitored by a handful of people who make the call on what is handed over to the public.”

This is not the way “open source” is used in tech circles, unless one of the people standing in the circle is someone who just wandered over from the people-who-misinterpret-the-literal-as-the-metaphorical circle.

“Open source” is not a metaphor. “Source” refers to the code written by programmers and used by compilers to create binaries, the files that computers execute. “Open” means that the source is available to the person running the binary. In the words of dictionary.com, “open” can mean “available.” That’s the way the term is actually used in tech circles.

Terdiman is right that Linux is open source. You can test this claim by downloading the source to the 2.6.0 Linux kernel. You’ll get a compressed file, and if you open it, you’ll find source code inside. If you find a precompiled operating system kernel inside, then you will have proved me wrong. That’s all that you need to do. Whether Linux is a “successful volunteer project” or a nascent alternative to Windows backed by IBM, HP, and Sun does not affect whether it is open source or not. Whether the source is open or closed has nothing to do with whether the project has “strict controls” or not.

Calling Wikipedia open source doesn’t make any sense. There is no “source” other than the content of the encyclopedia itself. When you edit Wikipedia, you are editing the current version of the encyclopedia. You are not editing some “source” that later gets compiled into a binary that is executed by a computer. (Yes, there is formatting markup that you can add to the text, but nobody is calling Wikipedia open source because you can change the formatting.)

The software that runs Wikipedia, called Mediawiki, is actually open source. As with the Linux kernel, you can prove this to yourself by downloading the source to the latest version, 1.5.3. If you find a web-based encyclopedia inside when you expand this compressed file, then I am lying.

I think that Daniel Terdiman should seriously consider the possibility that he has been propagating a misunderstanding about what “open source” means. He notes that the founder of Wikipedia, Jimmy Wales, “doesn’t even like to call it ‘open source.’” That’s not because Jimmy Wales is worried that he’s not as strict in his controls as Linux founder Linus Torvalds. It’s because the term is wrong.

There are other terms that can be used to describe Wikipedia that are not wrong. “Collaborative” is pretty good. Later on, Terdiman calls Wikipedia a “grand and very subjective experiment in collective writing.” That sounds good too.

I suspect that what’s actually happened is that Terdiman has been misled by other confused journalists. For example, a few days ago, Katherine Seelye of the New York Times had an article in the International Herald Tribune titled: “Wikipedia: Open-source, and open to abuse.” However, her article didn’t actually use the phrase “open source.” Through Google, I also found an article titled: “Wikipedia’s Open-Source Licks Open Wound” by Jaime Gottlieb that recounts the recent scandals with John Seigenthaler (former opinions editor for the fact-rich paper USA Today) and Adam Curry. Gottlieb’s article appears on 925M.com, “an online advertising community.” (I don’t know what that is.)

If anyone has counterexamples of “open source” being used to mean “collaboratively edited” or in some other metaphorical manner, I’d love to see them. In the meantime, I suggest Terdiman retitle his article, “Journalist notices discord from misuse of common technical term.”

Robert Kuttner, Google, and cookies

Saturday, December 3rd, 2005

Robert Kuttner needs to learn more about computers before he spouts off in a prominent newspaper about how they work. It is true that Google uses cookies, but they are not Google’s “whole business model”. Their business model is based on the idea that someone searching for cheese is likely to be interested in an ad for cheese. It is not based on cookies.

Kuttner writes, “Herewith an idea that I am putting into the public domain, which could make some computer-whiz a billionaire: One of Google’s competitors could guarantee users of its search engines that all data keeping track of searches will be permanently discarded after 24 hours.” I don’t think anyone will become a billionaire off this one, herewith, or anywhere elsewith. As anyone who can open “Preferences” on their browser can learn, you can refuse to allow cookies to be stored on your computer. Furthermore, even if Google does record your search history when you let them, you can defeat this through even a weak attempt at anonymization. We don’t need a competitor to Google to do this. We just need to turn off cookies.

If you turn off cookies, it is unlikely that Google could track anyone behind a firewall doing network address translation. At Chewonki, for example, Google couldn’t tell the difference between me and anyone else with cookies turned off. They could tell that we were coming through an ISP in Portland, Maine. And that’s without even trying to hide. Anyone who really wanted to hide could use a web proxy, or an onion router like Tor. It would be slower, but what kind of complaint is that? “I want anonymous browsing with a lightning-like search engine now!” Stop whingeing!
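As a rough illustration of the NAT point, here is a small Python sketch of what a server could use to tell visitors apart if the public IP address and a cookie were the only things it looked at. That is a simplifying assumption: a real server also sees other details, such as the browser's user-agent string, which this toy model ignores. The request layout, the function names, and the address (taken from a range reserved for documentation) are all invented for the example.

```python
def visitor_key(request):
    """Identify a visitor by the only two things this toy server sees."""
    return (request["public_ip"], request.get("cookie"))


def count_distinguishable(requests):
    """Number of visitors the server can tell apart."""
    return len({visitor_key(r) for r in requests})


# Two people behind the same NAT firewall share one public address.
# 203.0.113.7 is a reserved documentation address, not a real one.
without_cookies = [
    {"person": "me", "public_ip": "203.0.113.7", "cookie": None},
    {"person": "coworker", "public_ip": "203.0.113.7", "cookie": None},
]
with_cookies = [
    {"person": "me", "public_ip": "203.0.113.7", "cookie": "id=a1"},
    {"person": "coworker", "public_ip": "203.0.113.7", "cookie": "id=b2"},
]

print(count_distinguishable(without_cookies))
print(count_distinguishable(with_cookies))
```

With cookies off, both requests collapse into the same key; with cookies on, each browser carries its own identifier and the two people become distinguishable again.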

On top of all that, Google is a private company providing a useful service in exchange for a chance at your attention. Robert Kuttner is 100% free to not use Google, ever. Near the end of his article, Kuttner mentions (but does not link to, so I used Google to find it) Robin Sloan’s dystopian vision of the future, where “it’s almost impossible to differentiate journalism from junk.” I think that time has already arrived, and Kuttner is the one blurring the distinction.

My real concern about Google is that the US government will force them to invade my privacy. I think it would be more fair to call that a concern about the US government, rather than about Google. For example, in their privacy policy, Google makes a distinction between “personal information” and “sensitive information,” where the former is your name and email address, the latter your politics, race, sexual orientation, and so on. Google says that it will not record or use that sensitive information for purposes other than the purposes listed in their policy, which are basically running their services and targeting ads at you. That seems reasonable to me. They don’t need to connect, for example, your political affiliation to your social security number and address, to target ads at you.

Google knows that if they are careless with their users’ private information, they will be rejected in the marketplace, like Gator was. If Google screws up by releasing private info when they shouldn’t, a Google anonymizer will spring up in a week. (A Google search reveals that in fact, it already has.) I’d love to see laws that codify levels of personal information that web services record and force companies to comply with them. Truste sort of does this already, but they only enforce the display of privacy policies, not actual levels of privacy.

If we had a few well-known levels, I could decide easily “Oh, this is a Level X site, which means that they will destroy their records of my personal information within 1 day of me receiving whatever I buy from them. OK, I’m willing to do business like that. But I don’t want to do business with that Level Y site even if they are cheaper– they keep my information forever and sell it to spammers on a monthly rotation.” That’s what people should be writing and worrying about; unlike Google’s cookies, our federal laws cannot be turned off.