Search Engines, Investigations, and Identifying Useful Data

Information, information everywhere and not a drop to…

In my previous post, I touched on the evolution of the investigations industry, specifically the ways in which the Internet, by making so much information readily available, presents a certain kind of challenge to investigative professionals. Pre-Internet, the challenge was finding information; now, the challenge is sorting through it, deciding when you have enough, and determining what is good, bad, useful or distracting. We in the investigative and business intelligence sector are great at discovering information about people, companies, groups and organizations, but what separates good investigators from the pack is the ability to filter, focus, and stay on track.

My years of investigative experience have taught me that one of the most important ways to start a case is to set concrete goals. It seems obvious, but sitting down with your team and defining what you’re really looking for up front is vital to the success of a case. This discipline matters even more nowadays, when search engines are central to our process but inevitably suggest words and phrases that threaten to send investigators off in different directions and on unpredictable tangents. (It’s about to get harder, too: Google’s latest “real-time” search technology, while impressive and useful, has the potential to be hazardously distracting. Every letter you type reveals a new set of results, so by the time you’ve typed “John Smith”, you’ve had to avoid clicking on links about JetBlue, Jones Beach, John Mayer, Jon Stewart, and Captain John Smith.)

OK, I know, exploring those tangents and wandering off in different directions does get interesting at times, but we will talk about that in another post. Bottom line: if you don’t spend some time talking and thinking about search parameters and investigative goals, you may be unpleasantly surprised by the results.

The first filter we apply as results start to flood in is an analysis of sources – for example, we might glean some information about a subject’s business from Facebook, but it probably won’t be as reliable (hence as valuable) as a set of online SEC filings. In other cases, the opposite might be true.

We also need to question the information provider’s motivation. Who provided the data, and when? What was the ostensible purpose? This is not to say that you should discount apparently biased or distorted information — you just need to look at it more closely because it adds to the context of your search.

Even authoritative sources can lead to “false positives”, or complementary results that create the illusion of substantiation. In many cases, several seemingly independent sources will appear to confirm a piece of information separately, when in fact they have all relied on a single primary source. (Think about two news sites that base breaking news articles on the same erroneous wire report.) That’s why we at K2 Global spend so much time on deep analysis: it’s the only way to turn an ever-increasing abundance of raw information into useful knowledge.
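To make the wire-report example concrete, here is a minimal sketch in Python. It is only an illustration, not a description of any tool we use: the Report record, its primary_source field, and the sample data are all hypothetical, and in practice working out where an outlet actually got its information is the hard, manual part of the job. The idea is simply to count distinct primary sources behind a claim rather than the number of outlets repeating it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One published item (hypothetical record layout, for illustration only)."""
    outlet: str          # who published the claim
    primary_source: str  # where the outlet got it, as traced by the analyst
    claim: str           # the piece of information being asserted


def count_independent_sources(reports):
    """Return, per claim, how many distinct primary sources stand behind it."""
    origins = defaultdict(set)
    for report in reports:
        origins[report.claim].add(report.primary_source)
    return {claim: len(sources) for claim, sources in origins.items()}


# Made-up sample data: two outlets repeating the same wire report.
reports = [
    Report("News Site A", "wire report", "Company X is under investigation"),
    Report("News Site B", "wire report", "Company X is under investigation"),
]

counts = count_independent_sources(reports)
for claim, n in counts.items():
    print(f"{claim}: {len(reports)} reports, {n} independent source(s)")
```

Two reports look like corroboration, but the count of independent sources is one, which is exactly the illusion of substantiation described above.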

Shouldn’t we be able to trust online opinions?

“Discerning Internet users know that glowing online reviews of things like books or restaurants cannot always be trusted.” That’s the first sentence of last week’s New York Times story on the Reverb Communications settlement. (Reverb, a P.R. firm, apparently had its employees pose as ordinary consumers and plant positive reviews of its clients’ games in the iTunes store.)

The Times, of course, is right. We are so used to calculating whether online reviews are real — and more broadly whether people and organizations represent themselves honestly online — that we don’t even think about it anymore. It’s as if our online lives put us in a constant state of low-grade suspicion and mistrust.

I spend a lot of time thinking about this issue and working on ways to fix it — in fact, we’re building a practical technology solution that individuals and organizations with similar concerns should find quite helpful. It’ll be a better day when we can trust that people are representing themselves honestly online.

A Thought on Investigative Technologies

“What does K2 actually do?”

It’s a question I like to be asked, but one that is difficult to answer succinctly.

I often say that we gather information to help our clients make informed decisions about their business. That could cover anything from making an investment to completing a transaction or even launching a new product. I could talk about how the information behind a decision needs to be good, accurate, complete, reliable, and so on, but most people who ask already understand the basics of investigation.

What is more complex, and what challenges us daily, is managing all the information that we gather.

When I first started out in the business, we did not have the internet.

Gasp, I know.

While it may come as a shock to the younger generation of investigators, we didn’t just hop online when we started a case. In my early days, we didn’t even have computers. We had desktop word processors. The simple green glow of their screens was an upgrade for us, but even most ATM screens are more complex now.

For the most part, all you could do was type. We could do some limited media searches, and there were corporate registries, libraries, and a lot of paper indexes to review, all of which had been laboriously compiled by hand.

Fast forward to today, and each of us has access to huge volumes of information from an endless number of sources and perspectives. Just about everything and everybody is online somewhere, and the dynamic growth in the amount of information available only complicates things further.

Just about anything published now is instantly available online, and what has been published in the past is rapidly being scanned and uploaded. Soon enough the term “out of print” will mean the printer is off-line and little else.

Where analysts and case managers used to have to deal with a small trickle from computers, we are now dealing with a fire hose of information. We need to sort through it all, evaluate it, and decide what is good, credible information, what interests lie behind it, what the context is, when to keep looking and when to say “Stop, I have enough”. Information management is a major part of how we plan out our caseload.

In the next couple of posts I will be exploring the information explosion and how we manage it. I’ll be discussing the problems and challenges we face, examining techniques and approaches that we are trying and pointing out some of the trends that we are seeing in the industry.