The rise of the journalist-programmer

I’d call it some long-awaited recognition for many. Gawker: Hack to Hacker: Rise of the Journalist-Programmer.

Hmm… have I qualified as a Programmer-Journalist in the past?

Use FeedBurner to update Twitter from your blog

Instructions provided by Matt Cutts in “Doing the ‘Digital Cleanse’: no Twitter for a week”.

It’s a time saver.

Two links on simple map visualizations with Python

Simon Willison: Exploring Python: Stack Overflow Dev Days Amsterdam November 2009

FlowingData: How to Make a US County Thematic Map Using Free Tools

Pros and cons for NoSQL

Pros:

a tornado of razorblades: SQL Databases Don’t Scale (Hacker News thread)

Cons:

Code Monkeyism: The dark side of NoSQL (Hacker News thread)

Related:

Archives of the Caml mailing list: Message from Brian Hurt

Chris Williams, Co-Curator of NoSQL East, NoSQL: A Modest Proposal

Carsonified: Should you go Beyond Relational Databases?

The UNIX Way

Kas Thomas of CMS Watch riffs on “The UNIX Way”, principles summarized by Mike Gancarz:

1. Small is beautiful.
2. Make each program do one thing well.
3. Build a prototype as soon as possible.
4. Choose portability over efficiency.
5. Store data in flat text files.
6. Use software leverage to your advantage.
7. Use shell scripts to increase leverage and portability.
8. Avoid captive user interfaces.
9. Make every program a filter.

Read the whole piece.
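
To make “do one thing well” and “make every program a filter” concrete, here is a minimal sketch in Python. It is my own illustration, not something taken from Kas Thomas or Mike Gancarz; the names squeeze_blank_lines and run_filter are made up for this example. The filter takes a stream of text lines and collapses runs of blank lines into a single blank line.

```python
from typing import Iterable, Iterator, TextIO


def squeeze_blank_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines, collapsing each run of blank lines into one blank line."""
    previous_blank = False
    for line in lines:
        is_blank = not line.strip()
        if is_blank and previous_blank:
            # Skip the second and later blank lines in a run.
            continue
        previous_blank = is_blank
        yield line


def run_filter(source: TextIO, sink: TextIO) -> None:
    """Read text from source, write the filtered text to sink.

    Hypothetical helper: works on any pair of text streams, so it can be
    wired to stdin/stdout or to in-memory buffers.
    """
    for line in squeeze_blank_lines(source):
        sink.write(line)
```

To use it as a command-line filter, call run_filter(sys.stdin, sys.stdout) at the bottom of the script (after importing sys), then chain it with other tools through pipes. Because it reads and writes plain text streams and has no interactive interface, it also lines up with principles 5 and 8.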

cURLing with Alfresco’s and Google’s Data APIs

Jeff Potts: Curl up with a good web script (interacting with Alfresco’s Document Manager via CMIS and Atom)

Google Data APIs: Using cURL to interact with Google Data services

Bonus: commandlinefu.com: Update twitter via curl as Function

Hive, Hadoop at Facebook, Yahoo

Engineering@Facebook: Hive – A Petabyte Scale Data Warehouse using Hadoop

Yahoo! Developer Blog: Announcing the Yahoo! Distribution of Hadoop

Reading up on ETL (Extract, Transform, Load) processing

Wikipedia: Extract, transform, load

Wikipedia: Talend Open Studio

Talend Open Studio: Tutorials

Manageability: Open Source ETL (Extraction, Transform, Load) Written in Java

richard.gluga.com: Data Migration Done Right

kJube: Vendors and tools – ETL

AlfrescoForge: ETL Connector

Talend job for Job Scheduler implement

High Scalability: How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data

NYTimes: Announcing the Map/Reduce Toolkit

core-user@hadoop.apache.org: Andreas Kostyrka: Re: hadoop in the ETL process
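
For anyone new to the term, the three ETL stages can be sketched in a few lines of Python using only the standard library. This is a toy illustration of the general pattern, not how Talend, Hadoop, or any of the tools linked above work; the sample data, the payments table, and the function names are all made up for the example.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator

# Made-up sample input: messy names and one unparseable amount.
SAMPLE_CSV = "name,amount\n alice ,12.50\nbob,n/a\nCAROL,3\n"


def extract(csv_text: str) -> Iterator[dict]:
    """Extract: parse raw CSV text into one dict per row."""
    yield from csv.DictReader(io.StringIO(csv_text))


def transform(rows: Iterable[dict]) -> Iterator[tuple]:
    """Transform: tidy names, convert amounts to integer cents, drop bad rows."""
    for row in rows:
        name = (row.get("name") or "").strip().title()
        try:
            cents = round(float(row["amount"]) * 100)
        except (KeyError, TypeError, ValueError):
            continue  # amount missing or not a number
        if name:
            yield (name, cents)


def load(records: Iterable[tuple], conn: sqlite3.Connection) -> None:
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (name TEXT, cents INTEGER)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()
```

Chaining the stages looks like load(transform(extract(SAMPLE_CSV)), sqlite3.connect(":memory:")). Each stage is a generator feeding the next, so rows stream through one at a time instead of being held in memory all at once.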

Smart aggregation and API use in NPRbackstory

NPRbackstory is an automated Twitter feed that attempts to add context to the news stories trending today according to Google’s Hot Trends. It draws on NPR’s archives (very smart; as Joshua Benton notes, archives are underused assets) and uses Yahoo! Pipes to produce an RSS feed that is fed into the NPRbackstory account. It was developed by Keith Hopper of NPR’s Public Interactive group.

Read Joshua Benton’s piece at Nieman Journalism Lab.

Read more about it at Keith Hopper’s blog.

Check out his other Twitter-related project – Twitterstars – a tool to find local Twitter power tweeters.

Smart, useful desktop mashup of transit data for Philadelphians

Check out fellow Comcaster Mat Schaffer’s Mac Dashcode widget, “iSepta Train View”. As the name suggests, it mashes up data from the fantastic iSepta.org with Septa’s own Train View for a concise look into Septa’s regional rail status.