cURLing with Alfresco’s and Google’s Data APIs

Jeff Potts: Curl up with a good web script (interacting with Alfresco’s Document Manager via CMIS and Atom)
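
A minimal sketch of the idea – asking a local Alfresco install for an Atom feed of a folder’s children, straight from the shell. The host, port, service path, and admin credentials below are placeholders, so check your Alfresco version’s documentation (or Jeff’s post) for the actual URLs:

    # Ask Alfresco for the children of Company Home as an Atom feed.
    # Host, port, credentials, and service path are illustrative placeholders.
    curl -u admin:admin \
      "http://localhost:8080/alfresco/service/api/path/workspace/SpacesStore/Company%20Home/children"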

Google Data APIs: Using cURL to interact with Google Data services
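
The same spirit applies to the Google Data services: everything is an Atom feed you can GET. As a rough example – the calendar address below is a placeholder, and authenticated feeds need an Authorization header on top of this – pulling a public Google Calendar feed is just:

    # Fetch a public Google Calendar feed as Atom.
    # The calendar id is a placeholder; private feeds require authentication.
    curl -s "http://www.google.com/calendar/feeds/your_calendar_id/public/full"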

Bonus: commandlinefu.com: Update twitter via curl as Function
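
The commandlinefu snippet wraps the same trick in a tiny shell function, along these lines. Note that the basic-auth statuses API it targets has long since been retired by Twitter, so treat it purely as an illustration of the pattern:

    # Update Twitter from the shell (illustration only; the basic-auth
    # statuses API this posts to has long since been retired).
    tweet() {
      curl -u "$1" -d status="$2" http://twitter.com/statuses/update.xml
    }
    # usage: tweet username:password "hello from curl"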

Dare Obasanjo: “Don’t fight the Web, embrace it”

A must read: Dare Obasanjo: Explaining REST to Damien Katz:

There are other practical things to be mindful of as well to ensure that your service is being a good participant in the Web ecosystem. These include using GET instead of POST when retrieving a resource and properly utilizing the caching related headers as needed (If-Modified-Since/Last-Modified, If-None-Match/ETag, Cache-Control), learning to utilize HTTP status codes correctly (i.e. errors shouldn’t return HTTP 200 OK), keeping your design stateless to enable it to scale more cheaply and so on. The increased costs, scalability concerns and complexity that developers face when they ignore these principles is captured in blog posts and articles all over the Web such as Session State is Evil and Cache SOAP services on the client side. You don’t have to look hard to find them. What most developers don’t realize is that the problems they are facing are because they aren’t keeping RESTful principles in mind.
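
Those caching headers are easy to watch from the command line, too. A rough sketch – example.com and the validator values are placeholders, and it only works against a server that actually emits an ETag or Last-Modified header:

    # First request: note the ETag and/or Last-Modified headers in the response.
    curl -sI http://example.com/feed.xml

    # Send the validators back on the next request; a well-behaved server
    # answers 304 Not Modified with no body instead of 200 OK plus the full feed.
    curl -sI -H 'If-None-Match: "abc123"' http://example.com/feed.xml
    curl -sI -H 'If-Modified-Since: Tue, 08 May 2007 12:00:00 GMT' http://example.com/feed.xml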

“Global naming leads to global network effects.”

First, a reminder about what makes the Web, the Web….

W3C.org: Architecture of the World Wide Web, Volume One: 2. Identification:

In order to communicate internally, a community agrees (to a reasonable extent) on a set of terms and their meanings. One goal of the Web, since its inception, has been to build a global community in which any party can share information with any other party. To achieve this goal, the Web makes use of a single global identification system: the URI. URIs are a cornerstone of Web architecture, providing identification that is common across the Web. The global scope of URIs promotes large-scale “network effects”: the value of an identifier increases the more it is used consistently (for example, the more it is used in hypertext links (§4.4)).

Principle: Global Identifiers

Global naming leads to global network effects.

This principle dates back at least as far as Douglas Engelbart’s seminal work on open hypertext systems; see section Every Object Addressable in [Eng90].

What are the global – public – URIs of Facebook? What are they for any social network, for that matter?

This is an important train of thought to consider when debating how Facebook and other social networks influence our relationship with Google, and the entire Web.

Facebook’s growth devalues Google’s utility – it devalues the public Web – at least as it is described in “Small Pieces Loosely Joined” and in the Web’s own architecture document.

This is why Scoble couldn’t be more wrong when he says “Why Mahalo, TechMeme, and Facebook are going to kick Google’s butt in four years”: Facebook and other social networks will not only affect how we use Google – they will eliminate the utility of the Mahalos and TechMemes of the world – because those sites, too, rely on a robust and growing *public* URI ecosystem.

Dare: Why Google Should be Scared of Facebook:

What Jason and Jeff are inadvertently pointing out is that once you join Facebook, you immediately start getting less value out of Google’s search engine. This is a problem that Google cannot let continue indefinitely if they plan to stay relevant as the Web’s #1 search engine.

What is also interesting is that thanks to efforts of Google employees like Mark Lucovsky, I can use Google search from within Facebook but without divine intervention I can’t get Facebook content from Google’s search engine. If I was an exec at Google, I’d worry a lot more about the growing trend of users creating Web content where it cannot be accessed by Google than all the “me too” efforts coming out of competitors like Microsoft and Yahoo!.

The way you get disrupted is by focusing on competitors who are just like you instead of actually watching the marketplace. I wonder how Google will react when they eventually realize how deep this problem runs?

None of this invalidates Scott Karp’s riff on Scoble’s main point – there is a growing role for “Trusted Human Editors In Filtering The Web”. Our friends, our families, our communities. Not just machines and algorithms.

My favorite and fellow bloggers, Slashdot, Salon, the home page of the NYTimes, Philly Future, Shelley Powers, Scott himself, my news reader subscriptions – all of these are trusted humans, or representations of trusted humans, filtering the Web for me.

There’s nothing new about the fact that people play a direct role in how we discover what may interest us on the Web. It goes back to Yahoo!’s earliest days. Back to links.net, back to the NCSA What’s New page. It goes to the heart of what blogging is all about.

People have been way too hung up on Digg’s voting algorithms and forget that what makes Digg, Digg is its community of participants.

People forget Slashdot outright. As they do Metafilter.

So it still comes down to trust – What organizations do we trust? What systems do we trust? What communities do we trust? What people do we trust?

And just how do we share that with each other?

Adrian Holovaty: “Newspapers need to stop the story-centric worldview”

Adrian Holovaty: A fundamental way newspaper sites need to change:

This is a subtle problem, and therein lies the rub. In my experience, when I’ve tried to explain the error of storing everything as a news article, journalists don’t immediately understand why it is bad. To them, a publishing system is just a means to an end: getting information out to the public. They want it to be as fast and streamlined as possible to take information batch X and put it on Web site Y. The goal isn’t to have clean data — it’s to publish data quickly, with bonus points for a nice user interface.

But the goal for me, a data person focused more on the long term, is to store information in the most valuable format possible. The problem is particularly frustrating to explain because it’s not necessarily obvious; if you store everything on your Web site as a news article, the Web site is not necessarily hard to use. Rather, it’s a problem of lost opportunity. If all of your information is stored in the same “news article” bucket, you can’t easily pull out just the crimes and plot them on a map of the city. You can’t easily grab the events to create an event calendar. You end up settling on the least common denominator: a Web site that knows how to display one type of content, a big blob of text. That Web site cannot do the cool things that readers are beginning to expect.

I left a comment responding to a poster who said this sounded like the Semantic Web; I’ve also been meaning to write Adrian for a while now:

Hello Adrian,

I’ve been meaning to say hello to you for a number of different reasons over the past few years.

I’m an old Knight Ridder Digital developer. One of the folks that helped develop the Cofax CMS that was later replaced by KRD with… something else.

Cofax was a framework as well as a CMS, and in some very positive ways (well, *I* think so :)), Django reminds me of it. Cofax was open sourced, but when KRD replaced it, well, work pretty much kept me from going back, refactoring, and taking it where it could still go. It’s still in use in many places. Well, enough of that…

I definitely agree with you that newspapers are terrific places to work if you are a software engineer. The pace is quick, the work challenging, and you get the rare opportunity to not only practice your profession, but to do so building tools and services that connect, inform and empower people.

It’s hard to beat.

anonymous – yes, I think Adrian is talking Semantic Web here. But like Adrian’s call for newspaper organizations to take a hard look at how they manage information in their publishing systems, Tim Berners-Lee has made the same call to the web developer community. The hard sell has been that the Semantic Web likewise solves a series of problems of lost opportunity. It requires an investment in time and effort by the developer community to see its potential achieved. Adrian, please correct me if that’s an incorrect understanding on my part.

Great piece.

Related reading material: Aaron Swartz: “The Semantic Web In Breadth” and Shelley Powers: “The Bottoms Up RDF Tutorial”. Then there’s “Practical RDF” also by Shelley Powers (which I ummm need to get around to reading, but have always heard good things about).

More at Techdirt.

Full feeds versus partial feeds

Lots of folks out there take a hard line when it comes to publishing either full feeds (the entire contents of each post being published in RSS/Atom) or partial feeds.

Scoble, for example, is famous for declaring he won’t subscribe to anyone’s partial feed.

Shelley and Rafe have posted thoughtful takes on this, from either side of the fence.

My take? Well, I publish a full feed. But for the longest time I didn’t. It hasn’t made a difference to my readership one way or the other, because this is such a personal space for me.

‘There is more than one way to do it’ should not only be the motto of Perl, but the motto of the Web. There is room for both approaches – and many more. We’ve mostly gotten each other speaking the same language (hey, I know that’s arguable), but to argue that there is only ‘one true way’ to publish our sentences misses the beauty of the Web.