Saturday, October 28, 2006

A bug in Digg

I've been using Digg a lot in the last few hours, trying to generate interest in my last post (which was designed for Digg), and I think I may have found a bug. I'm sure there are other ways of reporting bugs, but it seems very much in the Digg spirit to do so through a blog post and Digg submission. Anyway, the bug seems to show up when you try to digg the "my number 1" story of a user you've just befriended. A message appears saying that you've already dugg the story, regardless of whether this is true. If you then leave your new friend's profile and go to the story's own page, Digg will once again let you digg it. The critical point is that the bug only appears when digging from the page to which you are sent immediately after befriending someone. Well, there it is. I can now rest peacefully knowing that I've done my bit to make Digg better.


Top Ten Wired Software Stories of All Time

Well, here it is: my guaranteed-to-make-it-to-the-Digg-front-page post.

1. Scripting on the Lido Deck
Issue 8.10 | Oct 2000
By Steve Silberman

One luxury cruise ship. A hundred hard-coding geeks. Seven sleepless nights. Welcome to the floating conference called Perl Whirl 2000.

This was the article that inspired me, so it gets the top spot.

2. O, Engineers!
Issue 8.12 | Dec 2000
By Evan Ratliff

Twenty years ago, Tracy Kidder published the original nerd epic. The Soul of a New Machine made circuit boards seem cool and established a revolutionary notion: that there's art in the quest for the next big thing.

Kidder’s Soul of a New Machine is one of the best books written about the software/hardware development process. Reading it changed my world, at least for a little while. I was maybe 13 when I read it. After that, what had been a mild interest in computers turned into a passion. The engineers working at Data General were my heroes. I have a few books that I reread again and again, and Kidder’s masterpiece is pretty close to the top of the list. It was also the first Pulitzer Prize winner I read, and I’ve been a faithful follower of the prize ever since.

3. The Java Saga
Issue 3.12 | Dec 1995
By David Bank

Sun's Java is the hottest thing on the Web since Netscape. Maybe hotter. But for all the buzz, Java nearly became a business-school case study in how a good product fails. The inside story of bringing Java to the market.

Java was among the first languages I tried as a young computer science student, and while I may have resented it at the time, the story of its creation is both fascinating and enlightening. Wired tells the story well and with a mild geek factor. In respect and remembrance of the dot-com bubble, I’ve tried to downplay the business side of things in my choices, but in some articles it still shines through.

4. Leader of the Free World
Issue 11.11 | Nov 2003
By Gary Rivlin

How Linus Torvalds became benevolent dictator of Planet Linux, the biggest collaborative project in history.

I don’t know how you can have a list of inspirational tech stories without putting one about the little guy from Finland who changed the world in the top 5. If this were a list based entirely on inspiration I think this would be a contender for number 1.

5. Code Warrior
Issue 7.12 | Dec 1999
By Chip Bayers

Microsoft's head honcho for Windows 2000 seeks perfection. It's a lonely crusade.

OK, so rounding out the top five is an article about the Great Satan. Some may question how a story about Microsoft can be inspiring. I ask you to think back to the days before the antitrust case and before the browser wars. I remember reading Coupland's Microserfs and thinking, perhaps wrongly, that this was really cool. Bill Gates is a nerd who made it, and we should respect him for that. It’s a good article about an interesting story. The geek factor is moderate, so it takes spot number 5.

6. The Quest for Meaning
Issue 8.02 | Feb 2000
By Steve Silberman

The world's smartest search engine took 250 years to build. Autonomy is here.

Having strayed a little from the computer science world, I’m shortly to become a librarian, and as a librarian, how can I not put a story about searching on the list? Making it even better is that it’s not about Google, every librarian’s worst nightmare. This also fits well with the next story on the list. Academia can be inspiring too.

7. The Cutting Edge
Issue 4.03 | Mar 1996
By Michael Meloan

In computer science at Stanford, academic research can be a battle - to the death.

It takes Wired’s impassioned writing to make computer science faculty interesting. Nonetheless, it’s good to find the inspirational in stories that aren't about making money.

8. When We Were Young
Issue 6.09 | Sep 1998
By David S. Bennahum

In the Golden Age of ASCII, kids could be king.

OK, so maybe I wasn’t alive in the era this article is about, but I wish I had been. Some people dream of the good old days when there were no computers, precisely because there were no computers. I know a librarian or two who fit this description. Others, however, dream of those days because they wish they could have witnessed the technology’s birth. I don’t long for the past because of safer streets, but because I want an Altair.

9. Open War
Issue 9.10 | Oct 2001
By Russ Mitchell

It started as a crusade for free source code. Linux zealots turned it into a full-frontal assault on Microsoft. Now the battle for the desktop could snatch defeat from the jaws of moral victory.

I think the flagship of the open-source era deserves to make it onto the list twice, and heck, what’s more inspiring than a real-life David and Goliath battle?

10. Meet the Bellbusters
Issue 9.11 | Nov 2001
By Steve Silberman

Network-geek power couple Judy Estrin and Bill Carrico helped build the Internet as we know it. Now they want to safeguard its soul.

Not so much about coding, but still an interesting story, and certainly important to all those Web 2.0 web service developers out there.


This all started a few weeks ago when I began looking through old Wired articles for mentions of Marshall McLuhan. I had been reading some of his older work out of interest and decided to turn it into a paper on new social technology and the institution of the library in North American society. In Wired’s early days they often published articles about their patron saint, and I was curious what they had to say.

Anyway, as I was paging through these dusty volumes I recalled, and subsequently found, an article that had inspired me to such a degree that, after a year-long hiatus from anything computer related, I jumped back in and started taking courses at the University of Toronto. It felt strange being the only political science major in a room full of math and computer science geeks, but it was a world I had always longed to be part of. It took a couple of months, but I got back into the swing of things and had a great time. I loved swiping my way into the computer lab or logging into the UofT machines via SSH. I got a great deal out of the experience and, ironically, bumped up my GPA.

I write this because I owe it all to the excellent writing and editing of the Wired staff and contributors. I have since met a number of others who, while not interested in computer science as a career or field of study, have benefited or could benefit from a similar experience. So, as a reward to those who have, and a source of inspiration to those who have not, I’ve compiled a list of the ten best or most inspirational programming and software engineering articles from Wired's past.

I’ve done my best to look through as many of Wired's featured articles as possible. However, I may have missed some and would certainly appreciate suggestions. Just so I don’t get a host of emails from irate readers, I’ll explain my criteria. Firstly, I looked for articles that spoke to the software development process, the development of a particular piece of software or language, or a personality important in the software development world. I judged the articles on their inspirational content, their level of detail and the degree to which they “got their geek on”. By the last criterion I mean the degree to which they got into the technical details.

Some may disagree with my rankings based on these criteria, and to them I say that the overriding criterion for any top ten list is the personal preference of the reviewer. If I liked it, it made it onto the list. Having said that, I still want input, both on the articles and on your own sources of written inspiration. The more comments the better.

Wednesday, October 25, 2006

First Google now Amazon; attacked from all sides

This is just a quick follow-up to my FRBR post. Amazon, it appears, is leveraging its Mechanical Turk to steal more than just the jobs of catalogers. Its new service, NowNow, is essentially a reference question service that works on mobile devices such as cell phones and BlackBerries. Unlike Google, which uses trained and vetted reference librarians (or equivalent), or Yahoo, which lets anyone answer, NowNow farms the questions out to Mechanical Turk. I'm not sure what the quality will be like, as anyone can sign up to work for Mechanical Turk. According to the FAQ, a response rating system will regulate the system to some degree. I'm not sure what the pay structure is, but it might be a quick way for an impoverished library student to earn a few bucks. I'm thinking that, if described creatively, it might also look interesting on a resume, particularly to someone not in the know.

Wednesday, October 18, 2006

FRBRising with the folks

During last semester's advanced cataloguing class I spent a great deal of time thinking about the relationship between traditional cataloguing and modern collective systems like "tagging". Eventually I began to think about FRBR and the changes coming out of OCLC. In particular, I focused on the various attempts being made to identify the "work" under the new system. OCLC has begun to test catalogue FRBRisation with what they call the "work set" algorithm and has met with some success, though certainly far from 100%. The problem that I just don't see them getting around is that often the "work" is simply not represented in the traditional bibliographic record, not even as a combination of elements. If this is the case, no amount of processing by computer or librarian will be able to accurately and consistently identify and group "works". What the FRBRisation process needs is just a little added information about each record. This seems like a perfect task for a social bookmarking application. I'm not suggesting that social bookmarking or tagging should replace the more traditional details of the cataloguing process. Any information that already exists in the bibliographic record or can be found on the chief source of information should still be dealt with in the traditional fashion. However, the "work" to which an item belongs is neither currently found in any pre-FRBR databases nor easily derived from the chief sources of information.

I do recognize that the devil is in the details with these things. The problem of multiple overlapping "works", among others, would undoubtedly arise in the absence of a predefined set. However, Google and others have had a great deal of success devising algorithms that use user input, but don't do so literally. User input is filtered and analyzed to identify trends, commonalities, etc. This is how the Google spell checker works: Google suggests a word based on the aggregate misspellings of the searching community. The idea behind a social FRBRisation project would not be to let the community definitively define the work of an item the way they might tag a picture on Flickr. It would be to generate the information lacking in a traditional bibliographic record, so that something like the "work set" algorithm might perform more efficiently.
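To make the "filtered, not literal" idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record IDs, the sample labels, the function name `consensus_label` and both thresholds are made up, and this is not OCLC's "work set" algorithm or anything Google actually runs. It only shows one way of turning many users' guesses about a record's "work" into a hint, and only when enough of them agree.

```python
from collections import Counter

def consensus_label(labels, min_votes=3, min_share=0.6):
    """Return the most common normalized label if enough users agree, else None.

    min_votes and min_share are illustrative thresholds, not tuned values.
    """
    # Normalize case and whitespace so trivial variants count as the same vote.
    normalized = [" ".join(label.lower().split()) for label in labels if label.strip()]
    if len(normalized) < min_votes:
        return None  # too little input to trust
    label, count = Counter(normalized).most_common(1)[0]
    return label if count / len(normalized) >= min_share else None

# Hypothetical user submissions naming the "work" behind each record.
submissions = {
    "rec-001": ["Hamlet", "hamlet", "Hamlet ", "Tragedy of Hamlet"],
    "rec-002": ["Hamlet", "Macbeth"],
}

# Only records with a clear consensus receive a hint; the rest stay untouched.
hints = {rec: consensus_label(labels) for rec, labels in submissions.items()}
```

A hint produced this way would be one more input to a grouping algorithm rather than a final answer, which is the point of using the community's input without taking it literally.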

The other problem I foresee is that a library undertaking a FRBRisation project might not have the time or resources to develop and shape the kind of community that would be required to pull something like this off. However, I believe this has a solution as well. I was recently listening to a podcast interview with Jeff Bezos, the founder of Amazon.com. He was addressing the various new non-consumer products that Amazon has begun to offer. All of the services were interesting to hear about, but one caught my attention immediately: Mechanical Turk. I remembered reading about it when Amazon first introduced it but hadn't paid any real attention to its development. The Mechanical Turk allows organizations to programmatically farm out small tasks to large groups of independent contractors. Each task is worth only a few cents, but individuals who sign up can, in theory, perform many tasks in a very short span of time, enough to make a reasonable sum of money. Bezos called it artificial artificial intelligence, because from the programmer's point of view the Amazon computer is doing all the work. In reality, the Amazon computer is asking a person and then sending the result back to the subscribing third party. My point is that a library that wanted to FRBRise its database quickly could employ the Mechanical Turk instead of waiting to build its own community.

Interestingly, Amazon initially developed the Mechanical Turk for internal use, to do much the same thing as I'm suggesting. Amazon had a problem with duplicate records. They realized that many products were virtually the same and could be sold and inventoried as a single product, but were in their database as two items. It was too large a problem to give to one person, or even a group of people, so they created a task marketplace, which evolved into the Mechanical Turk. A program would identify similar records and then submit them to the marketplace as a task. All the Amazon employee had to do to earn a few extra bucks was glance at each pair of records and answer yes or no to the program. If the answer was yes, the records were merged; if no, the program moved on. All I'm suggesting is that something like the "work set" algorithm replace the Amazon program. Sure, it would cost money, but looking at how things are priced, not as much as one might think.
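The loop described above (a program proposes look-alike pairs, a person answers yes or no, confirmed pairs get merged) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: title similarity via the standard library's `difflib.SequenceMatcher` is a stand-in for whatever matching Amazon or the "work set" algorithm really uses, the 0.8 threshold is arbitrary, and the `ask` callback is a hypothetical placeholder for the human answer rather than a call to the real Mechanical Turk service.

```python
from difflib import SequenceMatcher
from itertools import combinations

def candidate_pairs(records, threshold=0.8):
    """Yield pairs of record ids whose titles look similar enough to ask a person about."""
    for (id_a, title_a), (id_b, title_b) in combinations(sorted(records.items()), 2):
        ratio = SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()
        if ratio >= threshold:
            yield id_a, id_b

def merge_confirmed(records, ask):
    """Map each record id to a group id, merging pairs that `ask` confirms.

    `ask(id_a, id_b)` stands in for the human yes/no answer (hypothetical).
    """
    group_of = {rec_id: rec_id for rec_id in records}
    for id_a, id_b in candidate_pairs(records):
        if ask(id_a, id_b):
            old, new = group_of[id_b], group_of[id_a]
            # Move everything in id_b's group into id_a's group.
            for rec_id, group in group_of.items():
                if group == old:
                    group_of[rec_id] = new
        # On "no" nothing changes and the program simply moves on.
    return group_of

# Made-up records: two entries for the same title and one unrelated title.
records = {"a": "Microserfs", "b": "MICROSERFS", "c": "The Soul of a New Machine"}
always_yes = lambda id_a, id_b: True  # pretend the worker always answers "yes"
groups = merge_confirmed(records, always_yes)
```

The design choice worth noticing is that the program never merges on its own. It only narrows the question down to something a person can answer at a glance, which is what keeps each task cheap.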

Sunday, October 15, 2006

Digging for wikis

Library science student and avid blogger Jason Hammond has an interesting post on his blog HeadTail regarding the similarities between Google and wikis. I bring it up not only because Jason always has interesting and insightful things to say, but also because he's set himself the challenge of beating my record of 8 diggs on the social news bookmarking site digg.com. My digg got him to seven, so he's not far from leaving me in the dust. I encourage everyone not only to read his post but to give it a digg. I think it would be a first if we could get a Western LibSci blog to the Digg front page. For those of you in a hurry, here is the link to digg the story.

Wednesday, October 11, 2006

How much structure is too little structure?

I just finished reading Brian Lamb's article on wikis. Generally I was impressed. Lamb seemed to take a fairly critical view that I think was lacking in a number of the other articles. However, I do take issue with his brushing off of the structure issues that can arise when collaborating via a wiki. He seemed to suggest that user problems with wiki structure stem mainly from unfamiliarity rather than genuine issues with usability. Much of his argument in this regard rested on his faith in the inherent search functions of a wiki. This attitude seems counter to the library and information science perspective. I certainly don't want to go through the many arguments for proper cataloguing and sort over search, but they do exist and are prevalent enough that they shouldn't be ignored.

Lamb also suggests that recent additions and update lists, as well as other assorted tools, would somehow address the lack of structure and ultimately improve findability. I couldn't disagree more. If anything, tools such as the recent changes list hinder findability by focusing user attention on a small subset of the total information contained in the wiki. In terms of searching, the recent changes list is no different from a new book list in the library: while interesting and full of helpful information, it is not a tool for finding things.

Contextual links also come with their own problems, particularly when combined with unsupervised editing. If the only structured means of finding a page is through a contextual link, all that is required to orphan the page is for the link to be edited out of the related article. Given the known problems with free-text searching, a page that has lost its single contextual link may never be found.
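The orphaning problem is easy to see if you treat the wiki as a map from each page to the pages it links to. The Python sketch below is a toy under stated assumptions: the page names, the `links` structure and the `find_orphans` helper are all invented for illustration and do not correspond to any particular wiki engine's API.

```python
def find_orphans(links, roots=("Home",)):
    """Return pages that no other page links to, ignoring designated entry pages."""
    linked_to = {target for targets in links.values() for target in targets}
    return sorted(page for page in links if page not in linked_to and page not in roots)

# A tiny made-up wiki: each page maps to the set of pages it links to.
links = {
    "Home": {"Cataloguing"},
    "Cataloguing": {"FRBR"},
    "FRBR": set(),
    "Tagging": set(),
}

before = find_orphans(links)   # "Tagging" was never linked from anywhere

# One unsupervised edit removes the only contextual link to the FRBR page.
links["Cataloguing"] = set()
after = find_orphans(links)    # the FRBR page is now orphaned as well
```

A report like this only tells an editor which pages have fallen out of the link structure. Someone still has to decide where they belong, which is the structural work the wiki by itself does not do.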

None of this is to say that wikis are not a wonderful collaborative tool, but to ignore the issue of structure risks making a waste of a lot of positive effort. It would also seem that the inherent self organization of groups fades as the group grows increasingly diverse. As libraries of all sorts generally deal in diversity, the issue of structure, it would seem, should be even more relevant to them.

Vista's RSS platform podcast

Just thought I would post a link to an interesting podcast interview with Amar Gandhi, a program manager on the IE7 team. It's about a year old but still very interesting. In it they talk about RSS integration with Windows Vista, potential uses of RSS and some potential pitfalls, among other things. A word of warning: the first five minutes are very technical, but after that it gets fairly easy to follow. http://weblog.infoworld.com/udell/gems/ju_ghandi.mp3

Sunday, October 01, 2006

Looking for delusions in Google

Just a quick post for any who are interested. While searching for something completely unrelated, I found that Google Print has made available the full text of the book "Extraordinary Popular Delusions & the Madness of Crowds" by Charles Mackay. The book is notable for a number of reasons, not the least of which is its mention in James Surowiecki's book "The Wisdom of Crowds". Surowiecki, in part at least, structures his best-selling book as a response to the incidents that Mackay recounts in his work. Both books are interesting reads and in many ways fundamental to the development of social software both inside and outside the library.
