The Scoop


Outsourcing Database Development, or the Caspio Issue

September 7th, 2007  |  Published in Data, Journalism  |  20 Comments

Updated: Caspio’s David Milliron responds in the comments.

The good news is that there are plenty more databases served up on newspaper Web sites than there used to be. Some papers are organizing entire desks around data. The bad news - and people can disagree on this - is that in some cases, the papers aren’t really doing much in the way of Web database development. They’re outsourcing much of the work to Caspio and its Bridge application.

This can’t be such a bad thing, right? I mean, more databases online is a good thing, and of all people, I should be encouraging any way to get the stuff up. Unfortunately, it’s not that simple. By leaving the work to Caspio and reducing database development to the safety of point and click, news organizations are far more likely to end up with a bunch of cookie-cutter apps that go just far enough to satisfy the “hey, we need some databases” crowd but not nearly far enough to hold the attention of readers and provide a real service.

Give Caspio this much: it spotted an opportunity in an industry that’s trying to do more on the Web with less, and the company has managed to sign up clients including the Indianapolis Star, the Arizona Daily Star, the Palm Beach Post and the Atlanta Journal-Constitution. The pitch is attractive: Web databases in hours, not weeks, and all you have to do is load the data and decide how to display it. Indeed, Caspio boasts “No programming skills required” to use it. (I should say that I have not seen Caspio Bridge in person and have only had it described to me.)

So what’s the harm? I see several. First, Caspio’s product is an abstraction, built atop a database server (SQL Server, in this case) without giving users all of the power of that DBMS. It can’t, because in order to make things simple, it has to limit the ability of users. That translates into, for example, two choices for storing text: a maximum 255-character VARCHAR field and a maximum 4000-character VARCHAR field, even though SQL Server supports more. When you have a text field that always will be, say, 5 characters or less, it makes no sense to use a 255-character field. A small app won’t be affected in most cases, but larger ones could see a performance hit, especially on searches using wildcards. In addition, Caspio says it supports importing from text, MS Access databases and something called Caspio Bridge XML, which means that “only XML files previously generated by Caspio Bridge can be imported. XML files from other sources are not supported.” It also says, with no apparent irony, that “Text files are the least-desired import file formats because they cannot contain field data type information. If possible, import your data in one of the other formats. Caspio Bridge assigns a Text (4000) data type to all fields in tables imported as text files, unless you choose to appended (sic) or replace an existing table.” So it creates excessively large fields when importing text files, can’t deal with most XML and encourages the use of Access, which in practice means that very large datasets are going to be fun to deal with. Oh, and your organization (or Web site) doesn’t run on Windows? You may be out of luck.
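To make the field-size point concrete, here is a minimal sketch of the alternative to giving every imported text column a 4000-character type: scan the delimited file first and size each column to the longest value it actually holds. The function names, the sample data and the `VARCHAR(n)` output format are illustrative assumptions on my part, not anything from Caspio's product or docs, and a real import would also want to detect numbers and dates.

```python
import csv
import io


def infer_text_widths(text, delimiter=","):
    """Return {column_name: length of the longest value} for delimited text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    widths = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            # Skip overflow cells from rows with more fields than the header.
            if name in widths:
                widths[name] = max(widths[name], len(value or ""))
    return widths


def suggest_column_types(widths):
    """Map each column to a VARCHAR just wide enough for the data seen."""
    return {name: f"VARCHAR({max(width, 1)})" for name, width in widths.items()}


# Hypothetical sample data: a 5-character ZIP code and a city name.
SAMPLE = "zip,city\n44114,Cleveland\n20071,Washington\n"

widths = infer_text_widths(SAMPLE)
column_types = suggest_column_types(widths)
print(column_types)
```

In practice you would pad the suggested widths to leave room for future rows; the point is only that the information needed to avoid blanket 4000-character fields is sitting in the text file itself.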

Second, the data isn’t on your servers. I’m not sure what news organization would put a critical and exclusive data application on servers not under its control. That leaves room for plenty of potentially interesting data apps, like one charting Cleveland Browns’ games since 1946 from The Plain Dealer, but the lack of flexibility and control involved is ultimately going to be frustrating. If the PD wants to expand the Browns app, it has to add data to its database on Caspio’s servers. Can one Caspio app use data from another? It’s not clear from the docs. What about serving data to other types of apps like Flash? Nope. Want to use your Caspio app to build a Google Maps mashup? That’ll be an extra charge.

Third, Caspio makes it sound like programmers and developers only make things worse. They play on the ignorance of people who don’t know what good programmers can do. An example, from the online help: “A scripted function written by a programmer for you is handcrafted, and may not meet the quality and reliability standards that your application or web site demands.” Um, sure, if you have your neighbor’s 13-year-old write your database apps. Or this: “Hand-coded scripts are difficult to edit and maintain because they contain hundreds of lines of code that may or may not be properly documented. The original programmer may no longer be available, or the logic behind the script may be forgotten.” Apparently Caspio hasn’t heard of version control or documentation or, gasp, even Web frameworks that abstract many of the simple functions that it describes as potentially being convoluted. Caspio’s sales pitch seems to be: “Hey, you don’t know anything about Web database development, and you shouldn’t trust anybody who does.”

Lastly, I believe that anybody who has done any kind of work with data realizes that you learn more about the data by being more involved in its use. Some of the best features on the Congress Votes Database came to us only after we had spent time doing the Web development. So while speed-to-deployment is a very attractive feature, many times it results in a one-and-done approach: just slap it up and move on. This would be less risky if there weren’t smart people outside the media who can and will get their hands on useful data and do a better job with it than we are doing. News organizations can’t afford to rely on an approach that limits their choices and encourages quick but shallow development.

Caspio may have an upside; news organizations may come to realize that the value of putting up a database requires that they invest more time and effort, not less, and that it works better if they have the highest degree of control and flexibility. So maybe Caspio is a stepping-stone, an actual bridge rather than a crutch that people lean on instead of expanding their skills. But right now it sure doesn’t feel that way to me. News executives seeking to be able to tell the bosses that “we have some databases on the site” will find lots to love about a product like Caspio Bridge. The rest of us should take no comfort in that.

Responses

  1. Matt Waite says:

    September 8th, 2007 at 8:11 pm (#)

    Another problem I see: There are a lot of news organizations that are putting databases online and declaring victory. That somehow, just the act of having a database online to just regurgitate data is somehow “it.” Caspio appears to enable this. Yes, great, I’m happy that news organizations are putting data online where it should be. But is a search box or a couple of drop downs and a submit button really it? Is that all the thought we’re going to put into this? Are simple select queries all we’re going to do with this data? Seems a terrible waste.

  2. Mark says:

    September 9th, 2007 at 12:24 pm (#)

    As a Caspio user who has done Web programming too, my response:

    * I agree it would be better to build superb Web applications along the lines of ChicagoCrime.org, Politifact, the Congress votes database, etc. But those projects require a lot of resources most newspapers, with shrinking staffs, don’t have. Not all databases require the Django treatment to have value for the reader.
    * Many newspapers — including mine — don’t have a Web programmer on staff. Caspio lets even non-programmers serve data on the Web in a useful and attractive way. Many of the hand-coded Web databases created by CAR types like ourselves I’ve seen on the Web are butt ugly and poorly designed. Caspio apps at least are attractive to look at and useable, with little effort. Caspio lacks the flexibility you get from doing it yourself, but it’s also much faster.
    * You are too dismissive of Caspio’s argument that its code is of higher quality and easier to maintain. I’ve done enough Web programming to know that it is difficult to get right and often bites you in the ass when you don’t expect it. You can spend hours debugging a simple problem. That is especially a problem when you are inexperienced, as most journalist/programmers at most newspapers are likely to be. I’ve looked at many Web databases produced by CAR types on the Web and come across many errors (including, by the way, the Congress Votes Database, which I emailed WP.com about and has since been fixed).
    * Companies outsource data handling to third party vendors all the time. Every time you put a Google map on WP.com, you’re taking a risk Google won’t go down. I can’t conceive of a Web data application that is so “critical and exclusive” it would be, in the grand view of things, that big of a deal if the Caspio servers went down. It wouldn’t be good, but hardly a show stopper. In any event, from where I sit, the newspaper chains have the less reliable infrastructure.
    * Even newspapers with talented Web developers and CAR people, like the Indianapolis Star and the Atlanta Journal-Constitution, have turned to Caspio. There are more valuable projects out there than there are Web programmers, and not all data is worth programming effort, not if there are tools like Caspio available. Our business editor is currently trying his hand at building a Caspio gas price app, and a Web editor put up historical Derby weather data last May. They wouldn’t have done that if I had handed them a text editor and a PHP manual.
    * Personally, I’m amazed doing this stuff is still as hard as it is. The Web world is waking up to the fact that there is strong demand out there to put searchable data on the Web in useful ways. Caspio, the Google Mashup Editor, Microsoft Popfly, Zoho Creator, DabbleDb, etc., are all attempts to make doing that easier, but we’re a long way from having tools the average Joe or Jane in the newsroom can use. Caspio is at least closer — it’s a tool, filling a real need at the moment.
    * When the day comes that Web application programmers are standard issue in news departments, tools like Caspio won’t be needed. I don’t think most people who are doing this are saying this is sufficient, let’s stop here. Most are responding to corporate pressure from above to do something, now, which is both good, because it’s long overdue, and bad, because staff skills, resources, etc., aren’t yet aligned with the new goals.
    * Yes, one Caspio app can use data from another, assuming I understand what you’re saying there. Caspio can also take URL parameters passed externally. Caspio’s text field limits aren’t a significant problem for modest datasets or sites with modest traffic. I wouldn’t use it for datasets with millions of rows. I don’t know how it would do under really heavy loads, but there are more than a few examples of hand-crafted Web applications that have crashed or slowed to a crawl when getting a heavy spike in traffic. If Caspio has that problem, I haven’t heard of it, not that I necessarily would.

  3. Derek says:

    September 9th, 2007 at 3:21 pm (#)

    Mark,

    Thanks for the reply - I’m glad to have a user perspective. Some responses:

    I don’t think that every dataset requires a complex app, either. But some clearly do. What I’m afraid of, as Matt wrote, is editors not knowing the difference and simply choosing Caspio because it can do the minimum required. If a news organization has little or no capacity to do anything more than Caspio does, then all it will ever do is whatever Caspio allows. Yes, news organizations have to make choices, but given the competition, I don’t believe that having Caspio as your main option is a good idea.

    On code quality: I didn’t say that non-Caspio apps had no bugs, but the language Caspio uses strikes me as not simply saying, “Here’s a way for you to get data online,” but actively discouraging users from using other methods because they are worse.

    Also, when people helpfully point out errors with our database apps, it helps us learn and even generate new features. We added iCal feeds to our Candidate Tracker because someone asked for them - it took me a little time to figure out how, but I now could do them in minutes for other apps. I think that’s at least as helpful as posting in a help forum and waiting for the vendor to fix/implement something.

    Yes, we’re a long way from everyone in the newsroom being able to create database applications, but I’m not sure that’s even the proper goal. Although I’d love to see it happen, I doubt it ever will, and that’s not purely a software problem. I’m glad to hear that non-CAR types are learning how to work with databases online, but this idea that current tools are simply beyond our capabilities is damaging to us. I think Matt’s work on Politifact is a strong argument against that.

    My fear is that the people who set the budgets for newsrooms will, in contrast to what you wrote, see Caspio apps as plenty good enough, while all sorts of competitors, known and unknown, do better.

  4. Derek says:

    September 9th, 2007 at 5:03 pm (#)

    One other thing that bugs me about Caspio apps: since their Data Pages are loaded via JavaScript into existing sites, the results aren’t visible to Google. Indeed, a search for the phrase “Powered by Caspio” - which appeared to be required branding - yields 8 results, none from news sites. It makes little sense to build rich data apps when search engines can’t find them.

  5. Jean Dubail says:

    September 10th, 2007 at 9:40 am (#)

    I am the online editor at The Plain Dealer. One thing needs to be clarified in the reference to our Browns database: The data on the Caspio servers includes scores and other information for all games the Browns have played — about 870 of them — but the game stories are NOT in the database itself. The database includes links to those stories, which are on our server.
    In its first four days, by the way, including a Sunday and the Labor Day holiday, users performed more than 20,000 searches on the database. Which is one way of saying, it’s worth taking a look at Caspio before deciding whether it’s worthwhile.

  6. Derek says:

    September 10th, 2007 at 10:30 am (#)

    Jean,

    I appreciate the clarification on the game stories. And I don’t doubt that you’ve had good use of the database, particularly in its infancy. What concerns me is not whether Caspio might fit some needs, but whether it fits every need. Take permalinks, for example: if someone wants to link to a particular game, or even a season’s worth of games, there’s no permanent URL for those, since each search gets a unique session id in the browser.

    Derek
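The permalink point can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical base URL and hypothetical parameter names (`season`, `opponent`, `session_id`), with made-up search values; none of it comes from Caspio or The Plain Dealer. The idea is simply that a stable URL is built only from the search criteria, in a fixed order, and leaves the per-visit session id out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address for a games search page; not a real endpoint.
BASE_URL = "https://example.com/browns/games"


def permalink(params):
    """Build a stable URL from search criteria, ignoring session state."""
    clean = {
        key: value
        for key, value in params.items()
        if key != "session_id" and value not in (None, "")
    }
    # Sort by parameter name so the same search always yields the same URL.
    return BASE_URL + "?" + urlencode(sorted(clean.items()))


url = permalink(
    {"season": 1964, "opponent": "Baltimore Colts", "session_id": "abc123"}
)
print(url)

# The criteria can be recovered from the URL alone, with no session needed.
criteria = parse_qs(urlsplit(url).query)
```

Two readers running the same search get the same address, which is what makes it safe to link to, bookmark or index.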

  7. David Milliron (Caspio, Inc.) says:

    September 10th, 2007 at 10:42 am (#)

    Derek,

    Clearly you feel threatened. And for someone who claims to have never tried the product, you sure have developed a lot of opinion. Is that journalism or bias and fear?

    You summed up your entire position when you stated: “I should say that I have not seen Caspio Bridge in person and have only had it described to me.”

    Caspio is a framework. So are Django, Ruby on Rails and the many other products newspapers have used over the years to publish data to the web.

    And our SOAP-compatible Web Services enable developers to enhance Caspio Bridge applications with custom functionality. Web Services can be accessed through virtually any programming or scripting language including ASP, .Net, VB, PHP, C#, Cold Fusion, Java etc.

    As a 20-year veteran journalist and director of the company’s media services group, I could debate every one of your points, but why bother since you have never tried our product. Your post is riddled with inaccurate and misleading information and you owe it to your audience to properly research a product before you pepper the public with misinformation.

    If you or your readers want to try out the product, feel free to take it for a free 14-day test drive at http://www.caspio.com.

    In closing, I suspect your posting has offended the hundreds of media clients who use Caspio on a daily basis to achieve rapid time-to-market without requiring advanced IT skills.

    David Milliron
    Director, Media Services
    Caspio, Inc.
    650-691-0900 x741

    —————

    About Caspio, Inc.

    Caspio empowers business users to create and deploy web databases, forms and applications easily and without programming. Caspio’s on-demand platform eliminates coding with intuitive point-and-click wizards, enabling users to rapidly produce web database components for capturing, publishing and managing data online. Caspio shrinks development time from weeks to hours, and from thousands of dollars to a small monthly fee. Caspio’s customers include one-person entrepreneurs to Fortune-500 corporations, government agencies and educational institutions. Caspio is headquartered in Mountain View, California. For more information, please visit http://www.caspio.com and http://www.expressdb.com or call 650-691-0900.

  8. Derek says:

    September 10th, 2007 at 11:18 am (#)

    Hi David,

    It certainly isn’t fear that motivated me to write about Caspio, and I’m glad to have you correct any mistakes I have made in the post. Were my citations of the Caspio docs or forums in error?

    Derek

  9. Pete Zicari says:

    September 10th, 2007 at 1:55 pm (#)

    I see the PD has already been heard from, but as the source of most of the code written in the newsroom, I’m entitled to chime in, too.

    The important elements of the Caspio service are its speed and relative ease of use. If it didn’t exist, I would have a list of assignments that reached out the door, both because other people can now handle Caspio projects and because I just can’t turn a MySQL/PHP project around anywhere near as fast. On July Fourth, we had a spate of killings. In about 90 minutes July 5, I was able to figure out how to build a Google mashup and slap up a Caspio table to serve it, in lieu of a locator map from the art department, which would have taken much longer.

    With complete access to the stylesheet, I can make my Caspio table look like anything I want — that’s a picture from an old-time Browns game (when they won them) in the background of our Browns table. I can interpolate an HTML block if I want to break out of the limits of the Web table Caspio works within, and I can drop in data from the data table as needed.

    The system might not be as flexible as what I can write outside it (but they’re promising dynamic drop-down lists soon) and its docs could stand improving. But because it’s so easy, most of the projects can be done in the CAR office or by some other editor, leaving me free to consult and take up more challenging jobs.

    Few of our databases qualify as huge, and only the Browns table is getting heavy traffic now. Caspio says they’re running a stable, reliable system and I’ve got no reason to doubt them.

    If you’ve got a big programming staff and highly creative minds building unique apps, then go for it. Myself, I don’t have a problem with cookie-cutter projects when what I need is cookies.

  10. Matt says:

    September 10th, 2007 at 2:35 pm (#)

    We’ve been using Caspio too. As some have already said, it’s good for smaller databases where you don’t want to give the spit and polish shine of your own coding.
    Right now we’re literally using it as a bridge while we get access to our own server and slowly break down that wall between us and IT.
    We’ve already said some of the things we’d like to do would be a lot easier to do on our own than in Caspio. But we’re happy with what Caspio is letting us do now.

    Is Caspio our only web option for databases? Right now yes, but that’s more due to the wall thing.
    Is it helping us convince the higher ups that databases generate traffic and are worth doing? Yes.
    Do we tell our higher ups we could do a lot more if we were doing this ourselves? Heck yes.
    Is that plan working? Looks like. We get closer every day to getting our own server with no restrictions. We’ve got a fair amount of PHP under our belts and are now working on developing in Django too. If you told me that we’d be doing that three years ago I would have laughed until I passed out.

    I don’t want our site falling behind or our readers missing out though while we wait for inside baseball stuff to be resolved until we can do it ourselves.

  11. Wynn says:

    September 10th, 2007 at 4:08 pm (#)

    The way I see it, (and pardon me while I muddle my metaphors), it’s a double-edged blade. On the one hand, where all you need is a database up and up fast, Caspio gives you all you need to do. And its finest selling point, the rapid development time, is a big advantage in time- (and staff-) starved newsrooms.

    But it has the same drawbacks as any GUI, in that it effectively gives us blinders. A newsroom development team that focuses on Caspio’s slick Access-to-web functions isn’t going to see the possibilities for data to create Politifact or ChicagoCrime or any of the other holy grails of data journalism.

    It’s not an all-or-nothing thing. At the News-Leader we developed an app that mimics Caspio. It allows us to get data up, fast, which makes for better coverage (and without that monthly line-item on the budget). But it also leaves us more time to work with reporters and building within other, less restrictive frameworks. And by going through MySQL, it leaves those avenues of development open in a way that Caspio simply doesn’t.

    Of course, this has the added side-effect of lots of obnoxious e-mails to Derek, Schaver and a couple of the other wisened folk who troll this site, but I think it’s a good place to be.

  12. PI Buzz - Private Investigator | Public Records | Internet Search | Privacy | Reporting | Personal Information | Adoption | Genealogy | » Utah newspaper brings public records online says:

    September 11th, 2007 at 2:32 am (#)

    [...] You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site. Leave a Reply [...]

  13. Jonathan says:

    September 11th, 2007 at 9:22 am (#)

    This is the first time I’ve visited your blog, and I must say you went to a lot of trouble to bash Caspio for someone who hasn’t used it. Thankfully, your review contains plenty of links to let the rest of us decide for ourselves.

  14. David Milliron (Caspio, Inc.) says:

    September 12th, 2007 at 12:45 am (#)

    Derek:

    Here’s a competitor’s recent review of our product: Click Here

    Information about our product, features, security, etc., can be found at: Click Here

    Detailed information on pricing can be found at: Click Here

    And lastly, information on our customizations and add-ons, including our Microsoft Office Plug-In and SOAP-compliant web services, which are included with every account, can be found at: Click Here

    Should you have any questions about our product and wish to do a review, feel free to download our press kit at: Click Here

    I am also providing you the name and number for our press contact: Melissa Young, (650) 691-0900 x 525 or via e-mail.

    Should you have any additional questions, please do not hesitate to contact me directly.

    Regards,

    David A. Milliron
    Director of Media Services
    Caspio, Inc.
    485 N Whisman Rd, Ste 200
    Mountain View, CA 94043
    http://www.Caspio.com
    (650) 691-0900 x741

  15. Glen McGregor says:

    September 12th, 2007 at 4:27 pm (#)

    I’ve never used Caspio but if it can make coding online databases easier, I’d be willing to try.

    I’ve struggled in the past to get tech people in our news chain to port our CAR data to the website. It’s time consuming for them.

    I’m trying to learn to write PHP but ultimately, that’s a waste of time I should be using to report.

    Anything that can cut out the IT guys and give reporters control over the content helps.

  16. Derek says:

    September 12th, 2007 at 9:59 pm (#)

    Jonathan: Thanks for stopping by; feel free to let me know how you evaluate Caspio or any other product.

    Glen: Yes, putting CAR data online can be time-consuming. I guess it’s just a matter of whether or not it’s important to them. There also are other frameworks that don’t require weeks or months in order to put data online.

    Derek

  17. Jeff Glass says:

    September 21st, 2007 at 1:42 am (#)

    I’m not in the journalism trade. I’m in financial services (mortgage, to be specific). I stumbled across this blog and found the discussion interesting. I’ve been a Caspio user for about two years. I have practically no programming background. I continue to be amazed at the power of Caspio to free power users from the limitations of developer-controlled projects. I have built four applications using Caspio. I also recently was involved in a large ASP/.NET project with a big development team. The contrast between a DIY Caspio project and a full-fledged development project is breathtaking. I developed a feature-rich, data-driven web site capable of scaling to fairly massive storage and performance requirements. With Caspio I did it myself in a couple of weeks, with a little help from Caspio to get through a few complexities. The same project would have taken three or four times as long if done in the traditional way, and at much greater expense. I also would like to say that working with Caspio is not only easy for do-it-yourself types, but they have a superb support staff that goes out of its way to help users at all levels to get their applications built fast. I will engage in other projects that don’t use Caspio, but for certain types of projects Caspio is a dream come true.

  18. MTS » ONLINE DATABASE NEWS TOOLS says:

    October 12th, 2007 at 10:23 am (#)

    [...] Blog comments on Caspio [...]

  19. CaspioVote Post- Clarification | Jonathan Coffman | Blog says:

    November 14th, 2007 at 7:42 pm (#)

    [...] My post earlier today regarding my initial impressions of the CaspioVote election guide has unfortunately caused a little stir. It has not been deleted as of this posting. I have emailed Caspio asking for clarification and a demo of the product. I will wait to respond further until I have some documentation from them to either counter or confirm the information I was able to find regarding their latest web application. It has come to my attention that they are unhappy with what was written and feel that it was an unfair assessment of their application. All I can say at this point is that I took what information was available in the publicly accessible portions of their website, including links, their press release published today, and a quick 5-minute review of the source page of the one example I was able to find on a live site and my observations were made from that research. Obviously there is not much out there at this point, which I think is why it’s important to take an early look for an initial evaluation and it very well may change my outlook once we’ve setup a time for a demo. One of my primary interest areas is new-media and content delivery. Web applications that make that delivery easier and more accessible to more people are constantly on my radar. Fortunately for me, I’m in a position where I can try out and sample many applications and share the knowledge gained in a conversation with a community. CaspioVote is something that interests me personally and professionally which is why I took the time to look at how it can be used and how it works. If you’re interested in more information about Caspio in general and other conversations that have taken place check out these links: http://www.caspio.com/vote/ http://www.customerthink.com/news/caspio_named_finalist_in_cnet_webware_100 http://commonsensej.blogspot.com/2007/10/caspio-dustup.html http://www.bloggingstocks.com/2007/06/10/hyper-local-apps-to-save-the-newspaper-biz/ http://www.jacobian.org/writing/2007/sep/12/db-journalism/ http://www.thescoop.org/archives/2007/09/07/outsourcing-database-development-or-the-caspio-issue/ http://www.thescoop.org/archives/2007/09/12/on-trials-software-and-otherwise/ [...]

  20. Deciding our own fate: Newspapers must approach outsourcing intelligently to innovate and survive : William M. Hartnett says:

    November 28th, 2007 at 5:36 am (#)

    [...] consider database development. Or commenting platforms. My point is that an unthinking reliance on vendors to handle our most [...]

