Archive for the 'Google' Category

Uberpower by Josef Joffe

Tuesday, September 5th, 2006

Uberpower contains 7 chapters: 2 that describe the United States’ status as the last remaining superpower, 2 about the rise of sentiments for and against the US, and 3 that describe possible solutions. The first two chapters are informative, and then everything goes to hell in the second pair of chapters, leaving the remaining 3 chapters suspect.

I went over the edge in chapter 4, “The Rise of Americanism.” Joffe claims that there has recently been a surge in anti-Americanism, driven by a surge in Americanism. He uses Google searches for “anti-Americanism 2004” to measure the “sudden surge” in anti-Americanism. He writes:

When “anti-Americanism” followed by a particular year was entered, there were 180,000 entries for 2004. . . . For the 1970s, the average was 12,000, as it was for the 1960s and in the year 1950. (p. 97)

But if we search Google for just “2004,” we get 7,970,000,000 results, but only 284,000,000 for “1950.” This means that if we normalize Joffe’s results to account for the number of pages Google indexes containing the years 2004 and 1950, the phrase anti-Americanism is less commonly used with 2004 than 1950. Joffe says that his survey is “only suggestive.” Even if we could be sure that webpage mentions corresponded to public opinion, which is doubtful, his survey would be “suggestive” of the opposite of what he concludes. After reading this section, the rest of his conclusions were suspect.
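
The normalization can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the hit counts quoted above; the variable and function names are mine, and it assumes Joffe's 12,000 figure applies to the year 1950 as he states.

```python
# Hit counts quoted in this review: Joffe's phrase-plus-year searches,
# and the searches for the bare year.
hits_with_phrase = {"2004": 180_000, "1950": 12_000}
hits_year_only = {"2004": 7_970_000_000, "1950": 284_000_000}


def phrase_rate_per_million(year):
    """Phrase hits per million pages that mention the year at all."""
    return hits_with_phrase[year] / hits_year_only[year] * 1_000_000


rates = {year: phrase_rate_per_million(year) for year in hits_with_phrase}
for year, rate in rates.items():
    print(f"{year}: {rate:.1f} phrase hits per million pages mentioning the year")
```

By this measure the rate for 1950 (about 42 per million) is nearly double the rate for 2004 (about 23 per million).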

Later in the book, Joffe quotes John Winthrop as referring to America in a sermon as a “cittie upon a hill with the eies of all people upon them” (p. 108). Later, he quotes Winthrop again, but this time he writes “the eyes of all people uppon us” (pp. 240-241). Is he making this up? Throwing in funny spellings to liven things up? Is he this careless with the rest of his quotations?

Talking about the spread of American clothing styles around the world, Joffe writes, “Among the even younger set, the bulky pants of street surfboarders became de rigeur almost instantly . . .” (p. 99). I believe that I am one of the street surfboarders, a member of the even younger set that Joffe mentions, but I do not consider anything de rigeur. I get the feeling that Joffe experiences youth culture like my parents: “Let’s all wear novelty t-shirts to the teen center! Don’t wear that ballcap backwards– it’s disrespectful!”

In the concluding chapter of the book, Joffe summarizes his recommendation for US policy (which was the reason I wanted to read the book– how do we get ourselves out of this mess?). He writes, “Briefly: the United States will have to balance and to bond in order to extend its lease on the top floor of international politics” (p. 210). Ignoring the metaphor blighting the end of the sentence (I know, you Thomas-Friedman-reading Americans don’t care), that’s an interesting thesis, if you’ve read chapters 5 and 6 where Joffe summarizes what he means by balancing and bonding. Unfortunately, the rest of his book undermines his reliability.

I freely admit that I don’t know much about, for example, the military history of Europe; I have read zero works by Carl von Clausewitz, and Joffe’s summary of the downfall of Bismarck was the most I’ve read on the topic. This is why I read books by experts in fields with which I’m unfamiliar– I’m hoping to learn from their wisdom. My fields of expertise (spelling, Google, high school statistics, street surfboarding) and Joffe’s don’t overlap very much, but where they do, Joffe looks terrible.

I guess I’ll stick with Wikipedia– the authors might not all have Ph.D.s, but at least they know how to use Google.

Book review: Who Controls the Internet? by Goldsmith and Wu

Saturday, April 22nd, 2006

Longtime readers of this site will be loath to discover another tedious book review of a soft technical book, but perhaps there are a few new readers who have some life left yet. Read on, unknowing vanguard of the newly alienated!

Jack Goldsmith and Tim Wu, both law professors, wrote Who Controls the Internet? Illusions of a Borderless World (Oxford University Press, 226 pp., $28.00) to present their answer to the question of “whether the technological changes of the last decade . . . have had a lasting effect on how nations, and their peoples, govern themselves” (p. 180). They make a compelling case that the optimistic hope that the internet would erase national boundaries has been replaced by a reality of local control leveraged through governmental pressure on intermediaries, at least in the case of large multinational companies.

Goldsmith and Wu cite the changing attitude of Yahoo’s Jerry Yang between mid-2000 and the fall of 2005. In 2000, Yang was defiant toward French regulations that prohibit the sale of Nazi goods: “Asking us to filter access to our sites according to the nationality of web surfers is very naive” (p. 6). Five years later, Yahoo, in compliance with Chinese law, collaborated with the Chinese government to identify a dissident journalist, Shi Tao, through his Yahoo email account. Said Yang, “To be doing business in China, or anywhere else in the world, we have to comply with local law” (p. 10). The information that prompted Chinese authorities to jail Shi, like the Nazi paraphernalia offered for auction in 2000, was found on one of Yahoo’s servers in the United States. The situations are not exactly analogous, but Yang’s change in attitude is what’s important.

The enforcement pattern illustrated by Goldsmith and Wu is one in which local authorities pressure local intermediaries, generally ISPs, to control the content flowing over their wires. Search engines are also targeted, with Google.fr and Google.de cited as examples. Additionally, large multinational companies doing business over the internet need local law enforcement on their side. The capture of criminals is enacted as some local police officer pointing a gun at a would-be h4X0r and asking him or her to come quietly (as normal people are asleep at this hour). This is the major point that Goldsmith and Wu get right.

I’m not convinced by their thesis relating to free speech laws racing to match the lowest common denominator– I think that this is exactly what is occurring, but they disagree. In describing a case involving Australian libel laws and a wealthy Australian named Joseph Gutnick, Goldsmith and Wu suggest that wsj.com can choose not to do business in Australia if they don’t want to be subject to Australian laws. “Compliance with Australian libel laws . . . is a cost of doing business in Australia” (p. 157).

Goldsmith and Wu assume that publishing on the internet works like publishing newspapers. In traditional publishing, newspapers read in Australia are printed and distributed in Australia; the printers and distributors work in Australia. In internet publishing, documents are stored on a central server, and copies are transmitted to whichever computers request them. The server is maintained, at least in part, by people who work in the country where the server is hosted. Where a website does business is determined only by the consumers whose interests lead them to request the site using their web browser.

I could imagine a restricted case of the argument, in which we limit ourselves to the consideration of only subscription sites where local caches are used. If wsj.com maintained a server in Australia and charged subscribers to access it, that would be a compelling argument, but this does not describe the vast majority of the traffic on the internet. Most sites do not require subscription, and while many high-bandwidth publishers do maintain caches around the globe, most domains reside on only one server in one country.

Is it really nothing new that multi-national corporations need to either limit their speech to everyone or pay for communications filtering (in the form of IP geolocation)? In the case where people pay for permission to see content, as with the bulk of wsj.com, there is a business relationship formed. But consider a Z-list blogger such as myself who might decide to stain his soul by displaying Google ads for high-value keywords (exchange-traded funds, for example). When Goldsmith’s Australian cousin who likes reading the Z-list finds this review on Google and feels that I have besmirched his family name by disagreeing with his lawyer cousin, do I need to invest in geolocation filtering? Or will Google.com.au just do it for me? Preemptively, perhaps?

As one might expect from Oxford University Press, the mechanics of the book (rectangular pages, clear type) are well executed, save one typo in the last paragraph on page 160: missing period (or spurious capitalization) between “costly” and “But.” The cover, with its ghostly green pixel trails and monospaced subtitle, uses the imagery of The Matrix, or was it the cover to Steven Levy’s Crypto, published in 2000? Goldsmith and Wu’s book is printed on acid-free paper, but not on recycled paper. Crashing the Gate, by Jerome Armstrong and Markos Moulitsas Zuniga, which was published just a few days earlier, is of similar cost ($0.128/page vs. $0.124/page for Goldsmith and Wu) and target market, but it’s printed on “100 percent post-consumer-waste recycled” paper. Perhaps Oxford was concentrating its efforts on the editing, rather than the physical production of the book.

Unfortunately, the editing of the book could be better. We can be certain of this because an “adapted” subset of the book appeared in the January/February 2006 issue of Legal Affairs, and it was better. The improvement I noticed first was that the book’s strange assertion that “Apple came in Icelandic” (p. 50), which mistakes the company for its product, appeared as “Apple software came in Icelandic” in Legal Affairs (p. 42).

In general, the prose in Legal Affairs flows more smoothly and tells a more precise story. Consider these two sentences from the book: “Net geo-identification services are still relatively new but are starting to have their effects on e-commerce. Online fraud, and in particular online identity theft, has been a big challenge for e-commerce, causing firms and consumers to lose billions of dollars each year” (p. 61).

The corresponding sentences in Legal Affairs read: “Internet geo-identification services are still nascent, but they are starting to have effects on e-commerce. Online identity theft in the U.S. causes firms and consumers to lose billions of dollars each year.”

That’s a decrease of more than 20% in the number of words for the same content and a 50% reduction in usage of the word “e-commerce.” Note also the substitution of “nascent” for “relatively new” and the omission of “their” as a descriptor of “effects” in the first sentence. It’s unfortunate that the more permanent version (the book) couldn’t be the better version.
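
For anyone who wants to check those percentages, here is a rough sketch in Python. It counts whitespace-separated words in the two passages as quoted above; the helper names are my own, and a different definition of “word” could shift the numbers slightly.

```python
# The two passages as quoted in this review.
book = (
    "Net geo-identification services are still relatively new but are "
    "starting to have their effects on e-commerce. Online fraud, and in "
    "particular online identity theft, has been a big challenge for "
    "e-commerce, causing firms and consumers to lose billions of dollars "
    "each year"
)
legal_affairs = (
    "Internet geo-identification services are still nascent, but they are "
    "starting to have effects on e-commerce. Online identity theft in the "
    "U.S. causes firms and consumers to lose billions of dollars each year."
)


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def reduction(before, after):
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before


word_cut = reduction(word_count(book), word_count(legal_affairs))
mention_cut = reduction(book.count("e-commerce"), legal_affairs.count("e-commerce"))

print(f"words: {word_count(book)} -> {word_count(legal_affairs)} ({word_cut:.0%} fewer)")
print(f"'e-commerce' mentions cut by {mention_cut:.0%}")
```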

Who Controls the Internet nails the most important points of support for its thesis. I would have liked to see more information about the rise of ICANN and the history of the organizations that preceded it. The central argument of the book convinced me that the idea that the decentralized architecture of the internet would ensure decentralized control of the internet was wrong; as Goldsmith and Wu claim, we are at “the beginning of a technological version of the cold war” (p. 184).

Chrislott.org is smarter than I am

Saturday, February 18th, 2006

Chris Lott has a great post responding to my last post about this gatekeeper business.

As I commented on his blog:

Well said, Chris Lott.org! I know you used my blog post as an example of disagreement, but you have stated my central point quite well: “The issue here is that many in this debate use the term ‘Gatekeeper’ when what they mean is ‘powerful connector’ or something of that ilk.”

I’m not disputing the shape of the long tail or the influence of Doc’s link love– the question is just who, in your metaphor, is driving the Ferrari.

I claim that the drivers are Dave Sifry of Technorati and Page and Brin of Google, not Doc Searls. I also agree with your description of the desire behind the argument, the desire to have Searls (for example) “recognize the influential power they have and the amount to which they tend to link inside their circle and that other worthy people don’t get that link love.”

I was just drinking some cocoa at the old Darwin’s with the translation-obsessed SJ Klein, and we talked over the gatekeeper business. I suggested an idea I had recently: a WordPress plugin that presents your blogroll in reverse order of Technorati rank. SJ suggested an even more amusing possibility– sorting by reverse Technorati authority.

When this gets written, you’ll see the news here first (unless someone else writes it).
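
In the meantime, the sorting itself might look something like the sketch below. It is Python rather than an actual WordPress plugin, the blogroll entries and their numbers are invented, and it assumes that Technorati rank 1 means the most-linked blog and that authority grows with the number of inbound links.

```python
# Hypothetical blogroll; the rank and authority values are made up.
blogroll = [
    {"name": "A-list blog", "rank": 12, "authority": 9500},
    {"name": "Mid-list blog", "rank": 4800, "authority": 310},
    {"name": "Z-list blog", "rank": 950000, "authority": 3},
]


def by_reverse_rank(entries):
    """Least popular first: the largest rank number leads the list."""
    return sorted(entries, key=lambda entry: entry["rank"], reverse=True)


def by_reverse_authority(entries):
    """Lowest authority first."""
    return sorted(entries, key=lambda entry: entry["authority"])


for entry in by_reverse_rank(blogroll):
    print(entry["name"])
```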

Gatekeepers and you: the exciting third post

Saturday, February 18th, 2006

Seth Finkelstein and an unknown vendor of Algerian scarves responded to my last post with a few counter-arguments.

Seth claims that my comparison of the Boston Globe’s letters section in 1995 to blogs today “exemplifies a tendency to talk-down all the avenues that do exist, but we know are ineffective in practice (going around to various other publications), and talk-up an avenue that’s favored, but also seems ineffective overall (random related Google searches).”

I agree that I was talking down the 1995 alternatives to the Globe, like the Herald and the Phoenix. However, I was doing that because I thought that using them would be, as Seth says, “ineffective in practice.” I have actually tried doing things like that (sending letters to the Phoenix about the Globe’s poor reporting), and it was definitely ineffective. The other possibility is that I’m a ranting madman who deserves his place on the tip of the long tail, but if so, I submit that all of my arguments are always right because, hey, I’m a ranting madman.

Also, Seth is right that I was talking up Google, but I don’t think that Google is “ineffective overall.” I’m not arguing that Google is perfect. I’d like to say, “Here, look at how short this list of ‘things I can’t find on Google’ is,” but I don’t know how to generate that list. All I can say is that I regularly find my searching needs satisfied by Google. I would definitely be interested in hearing of counter-examples, though. (I’m not talking about net censorship here– a crippled Google is obviously less effective. What I mean is something like, “Here’s this brilliant analysis of X that Google could index, but due to reason Y, you can’t find it.”)

Seth goes on to say “That Doc Searls is a gatekeeper is shown unarguably by the fact that so many people talked about and linked to my post after he was kind enough to put it through his blog-gate.” I would say that it is shown arguably, rather than unarguably, and here is my argument. The letters editor for the Boston Globe is a gatekeeper– we all agree about that. He or she decides which letters get published, in the same way that an actual gatekeeper decides which Algerian scarf vendors get let through the gate to the castle and which are prodded with spears until they retreat.

Doc Searls, on the other hand, flies from conference to conference and writes about things that interest him. Google indexes his pages, and as a result of the link structure of the web and Google’s PageRank algorithm, pages that he links to end up higher in the Google search results. Doc Searls isn’t actively deciding who gets on the first page of Google. If Google changed their algorithm to use BrinRank, in which pages are sorted by length and links are ignored, then Doc Searls would do exactly the same thing, and entirely different pages would get the top results on Google. If Searls is the gatekeeper, rather than *Rank, what’s going on?
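
To make the point concrete, here is a toy sketch in Python. The four-page web, its page lengths, and its links are all invented, and `link_rank` is a simplified textbook-style PageRank iteration, not Google's actual algorithm. The same pages and the same links produce opposite orderings depending only on which ranking function is applied.

```python
# Toy web: page name -> (length in words, pages it links to).
# Every name and number here is made up for illustration.
PAGES = {
    "doc": (300, ["seth", "scoble"]),
    "seth": (800, ["doc"]),
    "scoble": (500, ["doc"]),
    "zlist": (2000, ["doc"]),
}


def link_rank(pages, damping=0.85, iterations=50):
    """Simplified PageRank: each page splits its score among its links."""
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page, (_length, links) in pages.items():
            # A page with no links spreads its score over every page.
            targets = links if links else list(pages)
            share = damping * score[page] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return sorted(pages, key=lambda page: score[page], reverse=True)


def length_rank(pages):
    """The hypothetical BrinRank: longest page first, links ignored."""
    return sorted(pages, key=lambda page: pages[page][0], reverse=True)


print("by links: ", link_rank(PAGES))
print("by length:", length_rank(PAGES))
```

Under the link-based ranking the heavily linked page comes first and the unlinked Z-lister comes last; under the length-based ranking the order flips, although nobody changed what they wrote or linked to.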

The second commenter, Monsieur Lheureux, characterizes Doc Searls as having the ability to drown me out in the Google listings, making my post “effectively inaccessible.” Unfortunately, this brings me back to the Boston Globe letter and 1995 again. When the Globe decides not to publish my letter, it is inaccessible. Nobody can ever get it, not even me. That’s different from being the 7 billionth result on Google. My blog post is still accessible from the internet, and I can tell everyone I communicate with how to get there.

I think the real complaint here should be about Google’s algorithm. To some extent, it is unreasonable to complain about a search engine’s algorithm. It’s like complaining about bad commercials on TV. Their intent is to make money, and they’ve figured out a good way to do it. I’m not saying that making money is an excuse for immorality, but I don’t think Google has a moral responsibility to popularize Z-listers such as myself.

One last note: anyone have any good suggestions for how PageRank could be improved? Simply ignoring links doesn’t work so well (Remember Yahoo in 1997? It sucked.). Anyone?

Doc Searls is not a gatekeeper.

Friday, February 17th, 2006

I know, I know, it’s sooo February 11th to discuss the gatekeeper issue, but Z-listers such as myself don’t spend all their time blogging.

Seth Finkelstein responded to the Stephen Kurkjian example in my last post with this point: “You only have one such result because nobody with higher ‘gatekeeperness’ wants them - not because of any great ability to reply.”

This highlights the point at which Seth and I disagree. He’s right that were someone with a highly popular blog (Doc Searls, for example) to start blogging extensively about Kurkjian, my result would soon be bumped down the list of Google results into oblivion. However, I don’t think “gatekeeper” is the right name for that situation. Doc Searls’ intent would probably not be to drown me out– he’s just adding his statements to the pile of available material on Kurkjian.

Viewed through the lens of Google, the effect is similar. Viewed from a perspective of information propagation, Doc Searls is just the reverse of a gatekeeper– he can’t blog about something without propagating the ideas that he mentions. Take Scoble’s brrreeeport meme. Nobody, not even Scoble himself, could stop that once it was started.

Compare that to the example I used in my last post of me sending a letter to the Boston Globe in 1995. The Globe acted like a gatekeeper. Some editor there decided not to publish my letter. As far as I know, that left the ocean of Globe readers exactly zero ways to find other public responses to the news. My best bet might have been standing in the middle of the Boston Common with a sign promoting my cause. Even then, there is no search engine that indexes placards found in public spaces.

That’s why I claim that Doc Searls isn’t a gatekeeper. On the other hand, I strongly agree with Seth’s criticism of the recent Technorati authority feature. I sent the following feedback to Technorati:

“Your new authority filter feature is a bad idea. It lets you find that which is already easy to find, while obscuring that which is already obscure. It might be a useful tool for establishing what blogs are popular, but popularity is very different from authority.”

“Perhaps you could call it the “mainstream” filter. It’s more accurate but less appealing than “authority.” Unfortunately, that’s what it really is.”

I think that it’s the algorithms of the Technoratis and Googles that are the real gatekeepers. So then the question is what is a good substitute for variations on “most links” or “most readers”?

Suggestion to Doc Searls: remove every “A-lister” from your blogroll. Divest your A-list links today!

Robert Kuttner, Google, and cookies

Saturday, December 3rd, 2005

Robert Kuttner needs to learn more about computers before he spouts off in a prominent newspaper about how they work. It is true that Google uses cookies, but they are not Google’s “whole business model”. Their business model is based on the idea that someone searching for cheese is likely to be interested in an ad for cheese. It is not based on cookies.

Kuttner writes, “Herewith an idea that I am putting into the public domain, which could make some computer-whiz a billionaire: One of Google’s competitors could guarantee users of its search engines that all data keeping track of searches will be permanently discarded after 24 hours.” I don’t think anyone will become a billionaire off this one, herewith, or anywhere elsewith. As anyone who can open “Preferences” on their browser can learn, you can refuse to allow cookies to be stored on your computer. Furthermore, even if Google does record your search history when you let them, you can defeat this through even a weak attempt at anonymization. We don’t need a competitor to Google to do this. We just need to turn off cookies.

If you turn off cookies, it is unlikely that Google could track anyone behind a firewall doing network address translation. At Chewonki, for example, Google couldn’t tell the difference between me and anyone else with cookies turned off. They could tell that we were coming through an ISP in Portland, Maine. And that’s without even trying to hide. Anyone who really wanted to hide could use a web proxy, or an onion router like Tor. It would be slower, but what kind of complaint is that? “I want anonymous browsing with a lightning-like search engine now!” Stop whingeing!

On top of all that, Google is a private company providing a useful service in exchange for a chance at your attention. Robert Kuttner is 100% free to not use Google, ever. Near the end of his article, Kuttner mentions (but does not link to, so I used Google to find it) Robin Sloan’s dystopian vision of the future, where “it’s almost impossible to differentiate journalism from junk.” I think that time has already arrived, and Kuttner is the one blurring the distinction.

The real concern I have about Google is how they will be forced to invade my privacy by the US government. I think it would be more fair to call that a concern about the US government, rather than about Google. For example, in their privacy policy, Google makes a distinction between “personal information” and “sensitive information,” where the former is your name and email address, the latter your politics, race, sexual orientation, and so on. Google says that it will not record or use that sensitive information for purposes other than the purposes listed in their policy, which are basically running their services and targeting ads at you. That seems reasonable to me. They don’t need to connect, for example, your political affiliation to your social security number and address, to target ads at you.

Google knows that if they are careless with their users’ private information, they will be rejected in the marketplace, like Gator was. If Google screws up by releasing private info when they shouldn’t, a Google anonymizer will spring up in a week. (A Google search reveals that in fact, it already has.) I’d love to see laws that codify levels of personal information that web services record and force companies to comply with them. Truste sort of does this already, but they only enforce the display of privacy policies, not actual levels of privacy.

If we had a few well-known levels, I could decide easily “Oh, this is a Level X site, which means that they will destroy their records of my personal information within 1 day of me receiving whatever I buy from them. OK, I’m willing to do business like that. But I don’t want to do business with that Level Y site even if they are cheaper– they keep my information forever and sell it to spammers on a monthly rotation.” That’s what people should be writing and worrying about; unlike Google’s cookies, our federal laws cannot be turned off.