Inventing a new and dynamic approach to computer security
Legend has it Sir Isaac Newton was idling beneath an apple tree when a falling piece of fruit caught his attention, planting the seeds of his theory of gravity. While Thayer School of Engineering Professor George Cybenko's new approach to organizing and retrieving information evolved from less fanciful origins, he admits the basic idea was something of an epiphany.
The technology Cybenko is working on is a radical departure from traditional information retrieval methods. Historically, he says, people have searched for the information they need by applying an Aristotelian approach. That strategy, like the ancient philosopher's logic, is based on premises or givens. If you have ever attempted to retrieve data from the Internet through a Web search engine, you have built the rules for your search by writing a Boolean expression. For example, you might ask for any documents containing the words "Dartmouth" and "engineering." That kind of approach doesn't work in today's large-scale, complex information environment, according to Cybenko.
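The rule-based search described above can be sketched in a few lines of Python. This is only an illustration, not any search engine's actual implementation: the corpus, the document names, and the `boolean_and_search` helper are all invented here.

```python
# Hypothetical toy corpus; names and texts are invented for illustration.
documents = {
    "doc1": "Dartmouth engineering students build a sensor network",
    "doc2": "Dartmouth hosts a winter carnival",
    "doc3": "An engineering school studies computer worms",
}

def boolean_and_search(docs, *terms):
    """Return names of documents containing every term (case-insensitive)."""
    wanted = {t.lower() for t in terms}
    return sorted(
        name for name, text in docs.items()
        if wanted <= set(text.lower().split())  # static rule: all terms present
    )

# The query "Dartmouth AND engineering" from the example above.
print(boolean_and_search(documents, "Dartmouth", "engineering"))  # ['doc1']
```

Each document is judged on its own against a fixed rule; nothing in the query says how observations relate to one another over time.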
Last year, he realized the reason a lot of approaches were failing or stalling is that they literally were trying to model the world the way Aristotelian science did up to the 1500s or 1600s. It failed because it just couldn't explain nature, until Newton came along with a whole different process-oriented way to view the world in terms of states and dynamics.
Following Newton's example, the Process Query System (PQS) paradigm that Cybenko has developed focuses on processes rather than rules or static expressions. It describes a sequence of steps and the evidence that supports transitions between those steps. Cybenko and his research team of graduate students, postdoctoral students and research scientists have even developed a working prototype of a PQS called TRAFEN (TRAcking and Fusion ENgine).
One of the specific applications Cybenko's group has investigated is the detection of computer worms. A worm is self-propagating code that goes through stages in which it (1) finds machines or other computers, (2) moves to those machines, (3) infects them, and (4) modifies files on those machines. Different worms use different mechanisms to self-propagate, but all go through these stages, each of which is characterized by specific indicators or behavior.
The figure depicts a simple model of a dynamic process in which internal, nonobservable states (A, B, C) emit events (a, b, g), but the events are not uniquely associated with the hidden states. A Process Query System observes sequences of events and builds associations between them and sequences of hidden states of the underlying process.
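One standard way to associate an event sequence with a hidden-state sequence is Viterbi decoding over a hidden Markov model. The sketch below uses that technique as a stand-in; the article does not say which algorithm PQS or TRAFEN uses, and every probability here is invented for illustration. Note that event "b" can be emitted by any of the three states, so a single observation does not identify the state.

```python
# Assumed model: states A, B, C and events a, b, g come from the figure
# description; all probabilities are hypothetical.
states = ["A", "B", "C"]
start = {"A": 1.0, "B": 0.0, "C": 0.0}
trans = {
    "A": {"A": 0.5, "B": 0.5, "C": 0.0},
    "B": {"A": 0.0, "B": 0.5, "C": 0.5},
    "C": {"A": 0.0, "B": 0.0, "C": 1.0},
}
emit = {
    "A": {"a": 0.7, "b": 0.3, "g": 0.0},
    "B": {"a": 0.2, "b": 0.6, "g": 0.2},
    "C": {"a": 0.0, "b": 0.2, "g": 0.8},
}

def viterbi(events):
    """Return (most likely hidden-state path, its probability)."""
    # best[s] = (probability, path) of the likeliest path ending in state s
    best = {s: (start[s] * emit[s][events[0]], [s]) for s in states}
    for e in events[1:]:
        step = {}
        for s in states:
            prob, path = max(
                ((best[p][0] * trans[p][s] * emit[s][e], best[p][1])
                 for p in states),
                key=lambda c: c[0],
            )
            step[s] = (prob, path + [s])
        best = step
    prob, path = max(best.values(), key=lambda c: c[0])
    return path, prob

path, prob = viterbi(["a", "b", "b", "g"])
print(path)  # ['A', 'B', 'B', 'C']
```

The ambiguous "b" events are resolved only by looking at the whole sequence, which is the point of modeling the process rather than testing each observation against a rule.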
TRAFEN uses detailed descriptions of this behavior to locate computer infections. In other words, "This is what a worm does; find anything that has that behavior." The more thorough the process description, the fewer false alarms will show up, although Cybenko cautions that being too specific could cause the system to overlook some worms altogether. Another application under development involves vehicle tracking in a network of acoustic sensors. Ultimately, Process Query System approaches are more efficient and scalable than Aristotelian approaches, which depend on rule-based processing of individual observations.
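To make "find anything that has that behavior" concrete, here is a deliberately simplified sketch that tracks how far each host has progressed through the four worm stages. It is not TRAFEN's implementation: the indicator names, the `track_hosts` function, the `min_stages` parameter, and the event log are all hypothetical, and a real system would have to weigh noisy, ambiguous evidence rather than exact labels.

```python
# Hypothetical indicator labels for the four stages described in the article.
WORM_STAGES = ["scan", "transfer", "infect", "modify"]

def track_hosts(events, min_stages=4):
    """Flag hosts whose events follow the worm stages in order.

    events: iterable of (host, indicator) pairs in time order.
    min_stages: how many consecutive stages must be seen before flagging.
    """
    progress = {}  # host -> number of stages matched so far
    for host, indicator in events:
        stage = progress.get(host, 0)
        # Advance only when the next expected stage is observed.
        if stage < len(WORM_STAGES) and indicator == WORM_STAGES[stage]:
            progress[host] = stage + 1
    return sorted(h for h, s in progress.items() if s >= min_stages)

# Invented event log for illustration.
events = [
    ("10.0.0.5", "scan"),
    ("10.0.0.9", "scan"),
    ("10.0.0.5", "transfer"),
    ("10.0.0.7", "modify"),   # a file change with no preceding stages
    ("10.0.0.5", "infect"),
    ("10.0.0.9", "transfer"),
    ("10.0.0.5", "modify"),
]

print(track_hosts(events))                 # ['10.0.0.5']
print(track_hosts(events, min_stages=2))   # ['10.0.0.5', '10.0.0.9']
```

Requiring all four stages ignores the lone file modification on 10.0.0.7, which a single-observation rule might flag. Lowering `min_stages` catches more candidates at the cost of more false alarms, mirroring the trade-off Cybenko describes.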
"The PQS breakthrough is an idea whose time has come," says Cybenko. "It's only become obvious in the last few years that existing solutions aren't scaling with the scope of the problem because our ability to network and collect a lot of information is relatively new." Although, in principle, PQS could have been developed 10 years ago, the computing power and the motivation to drive the need for a new approach were not quite there. "Computing power today is very small and cheap, with established Web and Internet standards, so it's very easy now to connect many different devices to a big network. All of a sudden, you can collect and contemplate the ability to aggregate very quickly, in real time, a lot of information. So, how do you do that? People haven't really been thinking about that much. It's a new kind of problem."
He credits the interdisciplinary composition of his team, whose expertise integrates computer science, communication, and mathematics, with the timely technology. "Researchers tend to be disciplinary, so if you're a computer scientist trying to tackle these problems, you probably don't have the systems theory or the electrical engineering background that's necessary to think the Newtonian way," he says. "The flip side is, if you're an electrical engineer, you probably don't realize that there are problems in computer security or network management that can be formulated in terms of process detection. So I think our success is largely due to the right mix of people with the right mix of backgrounds."
He adds that they have been working on related areas for five or six years, "so we've built up a solid repertoire of ideas, technologies and knowledge about what's going on."
The team's novel approach to information retrieval has already stimulated market interest. The U.S. government is particularly interested in applying the findings. In fact, grants from the Advanced Research and Development Activity, the Department of Homeland Security, and the Defense Advanced Research Projects Agency supported the research. Commercial software companies are also expressing interest in developing products around the PQS ideas.
PQS can be used to detect various types of physical behavior in the environment, such as vehicles moving in a region and plumes of airborne chemical or biological agents, which offers obvious security advantages.
Beyond that, he says, the government and others are interested in the research because people are recognizing that in many applications, like infrastructure monitoring and large-scale sensor networks, nobody really has a good idea of how to proceed because they're still thinking the Aristotelian way.
It is, he adds, "high-risk, high-payoff research. We're trying to change how people have been doing things for a long time."