Blogging for Dollars
How would you like to survey 20 million consumers in two minutes?
By Justin Martin


(FSB Magazine) - The blogosphere is a vast, unruly, and totally tantalizing mother lode of unvarnished consumer opinion on every product and service in the capitalist universe. But to know what the masses are saying about your product, you would have to dig through 350,000 fresh daily postings on a staggering 20 million blogs worldwide (90% of them are based in the U.S.). And that's just the beginning. Roughly 50,000 new blogs are launched every day.

Let's say you're still determined to find out how your new gizmo is being received. You could hire a couple of teenagers to Google the product name all day and compile a digest of all the blog citations. But they would probably end up with a vast amount of material, particularly for mass-market products such as a new line of blue jeans or brand of ice cream. And it would be difficult to draw meaningful conclusions from the quotes because bloggers don't always reveal basic information about themselves (such as their age and gender).

Enter Umbria, a market research firm in Boulder that designs software to find useful consumer intelligence on the Internet. "The blogosphere is overflowing with brutally honest opinion," says Howard Kaushansky, Umbria's 47-year-old CEO. "Our goal is to track those opinions down."

Every few hours Umbria (http://www.umbrialistens.com) sends an application called a spider out over the web to scour the blogosphere for postings about the firm's clients, most of which are big consumer companies, such as Electronic Arts, SAP, and Sprint. By analyzing keywords in blogs, Umbria can classify each citation thematically. In the case of Sprint, for example, Umbria's software can tell whether a blogger is talking about customer service, the company's advertisements, or a particular calling plan.
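As a rough illustration of keyword-based thematic classification, here is a minimal Python sketch. It is not Umbria's software: the theme names follow the Sprint example above, but the keyword lists and the function name classify_themes are assumptions invented for this example.

```python
import re

# Hypothetical keyword lists, made up for illustration only.
THEME_KEYWORDS = {
    "customer service": {"support", "representative", "hold", "helpline"},
    "advertising": {"commercial", "ad", "billboard", "jingle"},
    "calling plan": {"minutes", "plan", "overage", "rollover"},
}


def classify_themes(post):
    """Return every theme whose keywords appear as whole words in the post."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return [theme for theme, keys in THEME_KEYWORDS.items() if words & keys]


print(classify_themes("I was on hold for an hour and the representative was rude"))
```

A single posting can land in more than one theme, which is why the function returns a list rather than a single label.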

Another big challenge is to decipher what's on a blogger's mind. To figure out whether an opinion is strong or tepid, for example, it helps to know that "awesome" is a stronger endorsement than "pretty cool," and that "shoddy" is less damning than "abominable." Umbria has several employees with Ph.D.s in linguistics and artificial intelligence who are forever tweaking the software to make it better at categorizing opinions.
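One simple way to capture that ordering is a lexicon that assigns each phrase a signed strength. The sketch below uses the four phrases quoted above; the numeric scores and the name opinion_strength are assumptions chosen for illustration, not values from Umbria.

```python
# Hypothetical scores: only the ordering (awesome > pretty cool,
# abominable worse than shoddy) comes from the article.
INTENSITY = {
    "awesome": 2.0,
    "pretty cool": 1.0,
    "shoddy": -1.0,
    "abominable": -2.0,
}


def opinion_strength(post):
    """Sum the scores of every lexicon phrase found in the post.

    Uses plain substring matching, so it is deliberately crude.
    """
    text = post.lower()
    return sum(score for phrase, score in INTENSITY.items() if phrase in text)


print(opinion_strength("This phone is awesome"))
```

A posting that mixes praise and complaint, such as "pretty cool but the case is shoddy," nets out to zero under these made-up weights.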

Kaushansky claims his software can even identify sarcasm, a useful skill in the prickly blogosphere. Consider this statement: "The Treo 650 is the greatest phone in the world ... NOT!" Umbria's language-parsing software is "trained" to classify that and other common sarcastic turns of phrase as negative sentiments about the client. "Sarcasm is difficult for people to pick up, let alone machines," says Kaushansky. "But it's very valuable from a market research standpoint because it tells you how a customer really feels."
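The article says Umbria's software is trained to spot such phrases. The sketch below swaps in something far simpler, a single hand-written rule that flips a positive score when a posting ends with an emphatic "NOT!". The pattern and the name adjust_for_sarcasm are assumptions for illustration only.

```python
import re

# Hypothetical rule: a pause (ellipsis, comma, or dash) followed by "NOT!"
# at the very end of the posting.
SARCASTIC_NOT = re.compile(r"(\.\.\.|,|-)\s*NOT\s*!+\s*$")


def adjust_for_sarcasm(post, score):
    """Flip a positive score if the post ends with a sarcastic 'NOT!'."""
    if score > 0 and SARCASTIC_NOT.search(post.strip()):
        return -score
    return score


print(adjust_for_sarcasm("The Treo 650 is the greatest phone in the world ... NOT!", 2.0))
```

A rule like this catches only one turn of phrase, which is a fair hint of why sarcasm is hard for machines.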

The software can also estimate the author's age and gender. Elongated spellings ("soooooooo"), multiple exclamation marks (!!!), and acronyms such as POS ("parent over shoulder") suggest a teenage female member of Generation Y (born after 1979). The blogger is probably a teenage boy if a posting is rife with hip-hop terminology such as "aight" (translation: "all right") and "true dat" ("I agree!").

The twenty- and thirty-something members of Generation X are more likely to use complete sentences. Gen X men also tend to favor satiric jibes and vivid adjectives such as "sordid" and "hilarious." Gen X women favor elaborately emotive turns of phrase, such as "wishing I could just crawl out of my skin and go on without it" (a real example). Male baby-boomers, on the other hand, tend to favor stale hip-hop-isms such as "jiggy" and "bling." They also pepper their blogs with terms such as "prostate" and "IRA."
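The cues described above can be turned into a toy vote-counting classifier. This Python sketch uses only the examples the article mentions; the regular expressions, the group labels, and the name guess_demographic are assumptions, and a real system would need far more evidence than a handful of patterns.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical cue patterns built from the article's examples.
CUES = {
    "Gen Y female": [r"(\w)\1{3,}", r"!{3,}", r"\bPOS\b"],
    "Gen Y male": [r"(?i)\baight\b", r"(?i)\btrue dat\b"],
    "Gen X male": [r"(?i)\bsordid\b", r"(?i)\bhilarious\b"],
    "Boomer male": [r"(?i)\bjiggy\b", r"(?i)\bbling\b", r"(?i)\bprostate\b", r"\bIRA\b"],
}


def guess_demographic(post: str) -> Optional[str]:
    """Return the group with the most cue matches, or None if no cue fires."""
    votes = Counter()
    for group, patterns in CUES.items():
        for pattern in patterns:
            votes[group] += len(re.findall(pattern, post))
    group, count = votes.most_common(1)[0]
    return group if count > 0 else None


print(guess_demographic("that would be soooooo cool!!!!!!!"))
```

Here the elongated "soooooo" and the run of exclamation marks each count as one vote for the Gen Y female group.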

Umbria's service is warp-speed quick. It usually takes less than a minute for the spider to crawl through those 20 million blogs. That's followed by a few additional minutes spent running linguistic algorithms on any relevant blog entries. Then out spits an "Umbria Buzz Report" that tells clients how they are being portrayed in the blogosphere. The reports cover the overall brand experience, along with consumer reactions to specific products and even specific features of those products. Umbria also tallies the number of comments, classifies all of them by estimated age and gender, and gauges whether blog sentiment is skewing positive or negative. In its reports, available weekly or monthly, Umbria always makes a point of reproducing a few of the juiciest blog postings verbatim (see the "Raw Intelligence" box).
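The tallying step might look something like the following Python sketch, which counts postings by estimated demographic and reports which way sentiment skews. The ClassifiedPost record, its fields, and the buzz_summary function are hypothetical names invented here; the actual format of a Buzz Report is not described in this article.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ClassifiedPost:
    demographic: str  # estimated group, e.g. "Gen Y female"
    sentiment: float  # above zero is positive, below zero is negative


def buzz_summary(posts):
    """Tally postings by demographic and report the overall sentiment skew."""
    positive = sum(1 for p in posts if p.sentiment > 0)
    negative = sum(1 for p in posts if p.sentiment < 0)
    if positive > negative:
        skew = "positive"
    elif negative > positive:
        skew = "negative"
    else:
        skew = "neutral"
    return {
        "total": len(posts),
        "by_demographic": dict(Counter(p.demographic for p in posts)),
        "positive": positive,
        "negative": negative,
        "skew": skew,
    }


sample = [
    ClassifiedPost("Gen Y female", 1.0),
    ClassifiedPost("Gen X male", -2.0),
    ClassifiedPost("Gen Y female", -1.0),
]
print(buzz_summary(sample))
```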

Bloggers are often early adopters of products and services, according to Kaushansky, and they tend to be more fervent and expansive in their opinions than the general population. Clients say Umbria's service helps them discern attitudes that may not show up for months using traditional market-research tools such as surveys and focus groups. Working on behalf of U.S. Cellular (http://www.uscc.com), for example, Umbria trawled the blogosphere and provided an early read that teens were especially anxious about exceeding their calling-plan minutes. The reason: They worried that their parents might charge them for overages.

Buzz Reports are infinitely customizable. Umbria can gauge the response to a specific product launch or an advertising campaign. The service is also useful for gathering intelligence on competitors. Izze Beverage (http://www.izze.com), a 45-employee Boulder company that makes sparkling fruit juices, recently engaged Umbria to track what bloggers were saying about rival brands. The exercise was a revelation, according to CEO Todd Woloson. When a blogger had a bad fruit juice experience with one of his competitors, the result was often a profane online rant. "We want to make sure that never happens to us," says Woloson. The company recently hired a customer relations specialist who it hopes will soothe angry consumers before they take to their blogs.

RAW INTELLIGENCE

Umbria's software can guess bloggers' age and gender.

"Went to Burger King. Tried the new Angus Burger... didn't much care for it. Guess the 'g' was a typo..." --Gen X male (28-40)

"i'm very happy right now!!! my mom just bought me a new phone!!!!! ... so i'm finally getting out of nextel which i'm happy about because lately it's been giving me madd problems and i'm ready to throw it on the floor and break it ... wow I sound like my dad, how freaky!!!" --Gen Y female (15-18)

"Queen! You Know, the band? God, I'm still young. Where thehell is Queen? At least I found The Motels. I am so lonely at iTunes. Where are my people (peeps?) going to get music?" --Boomer Male (41-60)

Kaushansky is a former lawyer who worked at several small data-mining companies before founding Umbria in 2004. He has raised $6.75 million for the company, mostly from venture capitalists. Kaushansky projects that Umbria will generate $2 million in revenue this year. He expects the company to achieve profitability in 2006.

Umbria is a relatively small player in the $20 million blog research market, with a 10% share. Principal rivals include Cincinnati-based Intelliseek (http://www.invisible.com), which controls about a third of the market, and BuzzMetrics in New York, which does not disclose revenues. Both rivals also crawl the blogosphere on behalf of corporate clients, but they go a step further and meet with clients to interpret the data and suggest strategic responses. Umbria's solution, by contrast, is entirely software-based. "Ultimately we rely on both technology and humans for analysis," says Max Kalehoff, marketing director for BuzzMetrics. "Umbria takes an extremely automated approach."

Automation is the source of Umbria's competitive edge: affordability. Companies pay roughly $60,000 a year for its service. By contrast, the fee to engage one of its rivals can easily run into the seven figures. Kaushansky intends to maintain Umbria's low-cost and no-consultants strategy. The company's next frontier: algorithms that will classify bloggers by ethnicity, location, income, social class, and level of education. As a white, female, middle-class college freshman living in Akron might say on her blog, that would be soooooo cool!!!!!!!

