The word forensic comes from the Latin word forensis meaning “of or before the forum.” In ancient Rome, an accused criminal and the accusing victim would present their cases before a group in a public forum. In this very general sense it was not unlike the modern U.S. legal system where plaintiffs and defendants present their cases in a public forum. Of course, the rules and procedures of the presentation, of which there are very many, differ from those days. Also, whether in a civil trial or a criminal trial, all parties can be represented by lawyers trained in the intricacies of these rules and procedures.
At these ancient Roman forums, both parties would present their cases to the forum and one party would be declared a winner. The party with the better presentation skills, regardless of innocence or guilt, would often prevail. The modern system relies on the fact that attorneys representing the parties make the arguments rather than the parties themselves. The entire system relies on the assumption that lawyers, trained in law and skilled at presenting complex information, will present both parties’ cases in the best possible manner and that ultimately a just outcome will occur.
I don’t want to say that the truth will prevail, not only because that’s a cliché but because there is often some amount of truth in the arguments of both parties. Rather, more often than not, justice will be served.
This model works very well—not perfectly, but very well. With regard to highly technical cases, however, the percentage of cases where justice is served is lower because the issues are difficult for judges and juries to grasp. Technical experts can throw around highly technical terms, sometimes without realizing it and other times to purposely confuse a judge or jury. This is why two things are required to improve the analysis of software for the legal system:
Create a standard method of quantizing software comparisons.
Create a standard methodology for using this quantization to reach a conclusion that is usable in a court of law.
These two things are embodied in what is called “software forensics.” Before we arrive at a working definition, let us look at the definitions of related terms: “forensic science,” “forensic engineering,” and “digital forensics.”
The Need for Software Forensics
Some years ago, when I had just begun developing the metrics described in The Software Detective's Handbook (as well as the software to calculate the metrics and the methodology to reach a conclusion based on the metrics), I was contacted by a party in a software copyright dispute in Europe.
A software company had been accused of copying source code from another company. The software implemented real-time trading of financial derivatives. A group of software engineers had left one company to work for the other company; that’s the most common circumstance under which software is stolen or alleged to have been stolen.
The plaintiff hired a well-known computer science professor from the Royal Institute of Technology, Stockholm, Sweden, to compare the source code. This respected professor, who had taught computer science for many years, reviewed both sets of source code and wrote his report.
His conclusion could be boiled down to this: “I have spent 20 years in the field of computer science and have reviewed many lines of source code. In my experience, I have not seen many examples of code written in this way. Thus it is my opinion that any similarities in the code are due to the fact that code was copied from one program to another.”
Unfortunately for the plaintiff, the defendant responded by hiring another well-known computer science professor. This person was the head of the computer science department at the very same Royal Institute of Technology, the first professor’s boss.
This professor compared the source code from the two parties, and essentially her conclusion was this: “I have spent 20 years in the field of computer science and have reviewed many lines of source code. In my experience, I have seen many examples of code written in this way. Thus it is my opinion that any similarities in the code are due to the fact that these are simply common ways of writing code.”
The defendant did some research and came across my papers and my CodeSuite software. The defendant hired me, and I ran a CodeMatch comparison and then followed my standard procedure. CodeMatch revealed a fairly high correlation between the source code of the two programs. However, there were no common comments or strings, there were no common instruction sequences, and when I filtered out common statements and identifier names I was left with only a single identifier name that correlated.
Because the identifier name combined standard terms in the industry, and both programs were written by the same programmers, I concluded that no copying had actually occurred.
After writing my expert report, what struck me was how much a truly standardized, quantified, scientific method was needed in this area of software forensics, and I made it my goal to bring as much credibility to this field as there is in the field of DNA analysis, another very complex process that is well defined and accepted in modern courts.
According to the Merriam-Webster Online Dictionary, science is defined as “knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method.”
Forensic science is the application of scientific methods for the purpose of drawing conclusions in court (criminal or civil). The first written account of using this kind of study and analysis to solve criminal cases is given in the book entitled Collected Cases of Injustice Rectified, written by Song Ci during the Song Dynasty of China in 1248.
In one case, when a person was found murdered in a small town, Song Ci examined the wound on the corpse. By testing different kinds of knives on animal carcasses and comparing the resulting wounds to that of the murder victim, he found that the wound appeared to have been caused by a sickle. Song Ci had everyone in town bring their sickles to the town center for examination. One of the sickles began attracting flies because of the blood on it, and the sickle’s owner confessed to the murder. This groundbreaking book discussed other forensic science techniques, including the fact that water in the lungs is a sign of drowning and broken cartilage in the neck is a sign of strangulation. Song Ci discussed how to examine corpses to determine whether death was caused by murder, suicide, or simply an accident.
In modern times, the best-known methods of forensic science include fingerprint analysis and DNA analysis. Many other scientific techniques are used to investigate murder cases (to determine time of death, method of death, and instrument of death) as well as other less criminal acts. Some other uses of forensic science include determining forgery of contracts and other documents, exonerating convicted criminals through ex post facto examination of evidence that was not considered at trial, and determining the origins of paintings or authorship of contested documents.
Forensic engineering is the investigation of things to determine their cause of failure for presentation in a court of law. Forensic engineering is often used in product liability cases when a product has failed, causing injury to a person or a group of people. A forensic engineering investigation often involves examination and testing of the actual product that failed or another copy of that product.
The examination involves applying various stresses to the product and taking detailed measurements to determine its failure point and mode of failure. For example, a plate of glass at a very high temperature, when hit by a small stone, might chip, shatter, or crack in half. This kind of examination would be useful for understanding how a car or airplane windshield failed. The investigation might begin by replicating the situation that led to the failure in order to understand what factors might have combined to cause it.
Forensic engineering also encompasses reverse engineering, the process of understanding details about how a device works. Thus forensic engineering is critical for patent cases and many trade secret cases.
Two of the most famous cases of forensic engineering involved the Challenger and Columbia space shuttle disasters.
On January 28, 1986, the space shuttle Challenger exploded on takeoff, killing its crew. President Ronald Reagan formed the Rogers Commission to investigate the tragedy. A six-month investigation concluded that the O-rings — rubber rings that are used to seal pipes and are used in everyday appliances like household water faucets — had failed.
The O-rings were designed to create a seal in the shuttle’s solid rocket boosters to prevent superheated gas from escaping and damaging the shuttle. Theoretical physicist Richard Feynman famously demonstrated on television how O-rings lose their flexibility in cold temperatures by placing rubber O-rings in a glass of cold water and then stretching them, thus simplifying a complex concept for the public.
Further investigation revealed that engineers at Morton Thiokol, Inc., where the O-ring was developed and manufactured, knew of the design flaw and had informed NASA that the low temperature on the day of the launch created a serious danger. They recommended that the launch be postponed, but NASA administrators pressured them into withdrawing their objection.
On February 1, 2003, the space shuttle Columbia disintegrated over Texas during reentry into the Earth’s atmosphere. All seven crew members died. Debris from the accident was scattered over sparsely populated regions from southeast of Dallas, Texas, to western Louisiana and southwestern Arkansas.
NASA conducted the largest ground search ever organized to collect the debris, including human remains, for its investigation. The Columbia Accident Investigation Board, or CAIB, consisting of military and civilian experts in various technologies, was formed to conduct the forensic examination.
Figure 1: Challenger space shuttle: the crew and physicist Richard Feynman demonstrating the breakdown of the O-ring that was determined to be the cause
Amazingly enough, Columbia’s flight data recorder was recovered in the search. Columbia had a special flight data OEX (Orbiter Experiments) recorder, designed to record and measure vehicle performance during flight. It recorded hundreds of different parameters and contained extensive logs of structural and other data that allowed the CAIB to reconstruct many of the events during the last moments of the flight. The investigators could track the sequence in which the sensors failed, based on the loss of signals from the sensors, to learn how the damage progressed.
Six months of investigation led to the conclusion that a piece of foam that covered the fuel tank broke off during launch and put a hole in the leading edge of the left wing, breaching the reinforced carbon-carbon (RCC) thermal protection system that protected the shuttle from the extreme heat (2,700°C or 5,000°F) during reentry.
Figure 2: Columbia space shuttle: the crew and a scene during reentry from the recovered on-board shuttle video
“Digital forensics” is the term for the collection and study of digital data for the purpose of presenting evidence in court. Most typically, digital forensics is used to recover data from storage media such as computer hard drives, flash drives, CDs, DVDs, cameras, cell phones, or any other device that stores information in a digital format, for the purpose of determining important characteristics of that data that are useful in solving a crime or resolving a civil dispute.
These characteristics might include the type of data (e.g., pictures, emails, or letters), the owner of the data, or the date of creation or modification of the data. Digital forensics does not involve examining the content of the data, because that requires skills that are not necessarily those of computer science. For example, a digital forensic examiner may be able to recover a deleted email from an investment banker about a publicly traded company. However, it would take someone familiar with banking and banking regulations to determine whether the content of the email constituted illegal insider trading.
Digital forensics often involves examining metadata, which is the information about the data rather than the content of the data. For example, while the content of an email may give facts about insider trading by an investment banker and thus be useful evidence for criminal proceedings against that banker, the metadata might show the date that the email was created. If the banker was on vacation that day, this digital forensic information might be evidence that the banker was being framed by a colleague. Proving or disproving such an issue is a key component of the investigative part of digital forensics.
Digital forensic examiners often inspect large and small computer systems to look for signs of illicit access or “break-ins.” This can involve examining network activity logs that are stored on the computers. It may involve searching for suspicious files that meet certain well-known profiles and that are used to attack a system, or it may involve looking at files created at the time of a known break-in. It may also involve actively monitoring packets traveling around a network.
Techniques employed by digital forensic examiners include methods for recovering deleted and partially deleted files on a computer hard disk. They also include comparing files and sections of files to find sections that are bit-by-bit identical.
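As an illustration of finding bit-by-bit identical sections, the following is a minimal sketch and not a description of any particular forensic tool. It assumes both inputs are split into fixed-size blocks aligned at the same boundaries, so it would miss an identical section that is shifted by a few bytes. The function names, the 16-byte block size, and the sample data are all hypothetical.

```python
import hashlib


def block_digests(data: bytes, block_size: int) -> dict:
    """Map the SHA-256 digest of each aligned block to the offsets where it occurs."""
    digests = {}
    for offset in range(0, len(data) - block_size + 1, block_size):
        digest = hashlib.sha256(data[offset:offset + block_size]).digest()
        digests.setdefault(digest, []).append(offset)
    return digests


def identical_blocks(a: bytes, b: bytes, block_size: int = 16) -> list:
    """Return (offset_in_a, offset_in_b) pairs for aligned blocks that are identical."""
    digests_a = block_digests(a, block_size)
    matches = []
    for offset_b in range(0, len(b) - block_size + 1, block_size):
        block = b[offset_b:offset_b + block_size]
        digest = hashlib.sha256(block).digest()
        for offset_a in digests_a.get(digest, []):
            # Confirm byte-for-byte so a match never rests on the digest alone.
            if a[offset_a:offset_a + block_size] == block:
                matches.append((offset_a, offset_b))
    return matches


# Hypothetical sample data: one 16-byte section is shared between the two "files".
file_a = b"A" * 16 + b"shared-section!!" + b"C" * 16
file_b = b"X" * 16 + b"Y" * 16 + b"shared-section!!"
matches = identical_blocks(file_a, file_b)
print(matches)
```

The sketch reports only exact matches. Content that has been edited, even slightly, produces no match at all.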
Other techniques include recovering and examining metadata that gives important information about the creation of a file and its various properties. Automatically searching the contents of files and manually examining the contents of files are also important techniques in digital forensics.
Digital forensic examiners must be very careful about how data is extracted from a computer so that the data is not corrupted while the extraction is taking place. Operating systems typically maintain important metadata about files, and any modification of a file, such as moving or copying it for the purpose of examining it, will change the metadata.
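The effect of an ordinary copy on metadata can be seen in a small experiment. The sketch below is only an illustration under stated assumptions: the file names and the past timestamp are hypothetical, and Python's `shutil.copy2` is used purely for contrast, not as a forensically sound acquisition method.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "evidence.txt")
    with open(original, "w", encoding="utf-8") as handle:
        handle.write("quarterly results draft\n")

    # Give the file a hypothetical past modification time (September 2001).
    past = 1_000_000_000  # seconds since the epoch
    os.utime(original, (past, past))

    # An ordinary copy: the data is identical, but the copy gets a new timestamp.
    plain_copy = os.path.join(workdir, "plain_copy.txt")
    shutil.copy(original, plain_copy)

    # copy2 additionally tries to carry the timestamps over to the copy.
    preserved_copy = os.path.join(workdir, "preserved_copy.txt")
    shutil.copy2(original, preserved_copy)

    original_mtime = os.stat(original).st_mtime
    plain_copy_mtime = os.stat(plain_copy).st_mtime
    preserved_copy_mtime = os.stat(preserved_copy).st_mtime

    with open(original, "rb") as first, open(plain_copy, "rb") as second:
        contents_identical = first.read() == second.read()

print(contents_identical, original_mtime, plain_copy_mtime, preserved_copy_mtime)
```

The plain copy has the same content as the original but a different modification time, so an examiner working from it would see a misleading date.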
For this reason, special techniques and special hardware have been developed to preserve the contents of computer disks prior to a forensic examination. This can be particularly tricky when the system being examined is used in an active business, such as an online retailer, or in a critical system, such as one that controls a medical device and must operate 24/7. In these cases, special techniques, special hardware, and special software have been developed to extract data from such a live system.
Evidence procedures, such as how an item or information is acquired, documented, and stored, are very important. An examiner should be able to show what procedures were or were not used to collect the evidence and how the evidence was stored and protected from other parties.
Digital forensic examiners must also be very concerned about documenting the chain of custody, which is the trail of people who handled the evidence and the places where it has been stored. In order to reduce the chance of evidence tampering, and to relieve any doubts in the mind of a judge or jury, the chain of custody must be well documented in a manner that can be verified.
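One possible way to make a custody record verifiable is to link each entry to the one before it with a cryptographic hash, so that a later alteration of any entry can be detected. The following is a minimal sketch of that idea, not a description of standard practice or of any specific tool. The record fields, function names, and sample handlers are all hypothetical.

```python
import hashlib
import json


def entry_digest(body: dict) -> str:
    """Hash a custody entry in a stable (sorted-key) JSON form."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, handler: str, location: str, evidence_sha256: str) -> None:
    """Add an entry that records who held the evidence, where, and the prior entry's digest."""
    previous = log[-1]["digest"] if log else ""
    body = {
        "handler": handler,
        "location": location,
        "evidence_sha256": evidence_sha256,
        "previous": previous,
    }
    log.append({**body, "digest": entry_digest(body)})


def verify_log(log: list) -> bool:
    """Check that every entry is unmodified and correctly linked to its predecessor."""
    previous = ""
    for record in log:
        body = {key: value for key, value in record.items() if key != "digest"}
        if body["previous"] != previous or entry_digest(body) != record["digest"]:
            return False
        previous = record["digest"]
    return True


# Hypothetical evidence and handlers, for illustration only.
evidence_hash = hashlib.sha256(b"disk image bytes").hexdigest()
custody_log = []
append_entry(custody_log, "Examiner A", "Evidence locker 1", evidence_hash)
append_entry(custody_log, "Examiner B", "Forensics lab", evidence_hash)
log_is_valid = verify_log(custody_log)
print(log_is_valid)
```

A scheme like this shows only that the record has not been altered since it was written. It does not show that the entries were truthful when they were made.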
Software forensics is the examination of software for producing results in court; it should not be confused with digital forensics. There are times when digital forensic techniques are used to recover software from a computer system or computer storage media so that a software forensic examination can be performed, but the analysis process and the methodology for finding evidence are very different.
Unlike digital forensics, software forensics is involved with the content of the software files, whether those files are binary object code files or readable text source code files.
The objective of software forensics is to find evidence for a legal proceeding by examining the literal expression and the functionality of software. Software forensics requires a knowledge of the software, often including things such as the programming language in which it is written, its functionality, the system on which it is intended to run, the devices that the software controls, and the processor that is executing the code.
Whereas a digital forensic examiner attempts to locate files or sections of files that are identical, for the purpose of identifying them, a software forensic examiner must look at code that has similar functionality even though the exact representation might be different. In patent and trade secret cases, functionality is key, and two programs that implement a patent or trade secret may have been written entirely independently and look very different.
In copyright and trade secret cases, software source code may have been copied but, because of the normal development process or through attempts to hide the copying, may end up looking very different. Digital forensic processes will not find functionally similar programs; software forensic processes will. Digital forensic processes will not find code that has been significantly modified; software forensic processes will.
Thoughts on Requirements for Testifying
In recent years I have been frequently disturbed by the poor job done by some experts on the opposing side of cases I have worked on. Sometimes the experts do not seem to have spent enough time on the analysis, most certainly because of some cost constraints of their client. Other times the experts do not actually have the qualifications to perform the analysis.
For example, I have been across from experts who use hashing to “determine” that a file was not copied because the two files have different hashes. As anyone familiar with hashes knows, changing even a single space inside a source code file will result in a completely different hash. While hashing is a great way to find exact copies, it cannot be used to make any statement about copyright infringement.
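The sensitivity of hashes to trivial changes is easy to demonstrate. The sketch below uses SHA-256 from Python's standard library; the two source lines are hypothetical, and the choice of SHA-256 is an assumption, since the text does not say which hash function the experts used.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two hypothetical source lines that differ by a single extra space.
original = "int total = a + b;\n"
modified = "int total = a  + b;\n"

hash_original = sha256_hex(original)
hash_modified = sha256_hex(modified)

hashes_match = hash_original == hash_modified
differing_hex_digits = sum(x != y for x, y in zip(hash_original, hash_modified))

print(hash_original)
print(hash_modified)
print(hashes_match, differing_hex_digits)
```

The two lines are functionally identical, and one is plainly derived from the other, yet their hashes differ. A mismatch therefore says nothing about whether code was copied and then modified.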
Most disturbing is when an expert makes a statement that is unquestionably false and the only reason it could be made is that the expert is knowingly lying to support the client. In one case an expert justified the scrubbing of all data from all company disks (overwriting the data so that it could not be retrieved) the weekend after a subpoena was received to turn over all computer hard drives, calling it a normal, regular procedure at the company.
Another time an experienced programmer—the author of several programming textbooks—claimed that she could determine that trade secrets were implemented in certain source code files simply by looking at the file paths and file names. Yet another time a very experienced expert, after hours at deposition trying to explain a concept that was simply and obviously wrong, finally admitted that the lawyers had written his expert report for him.
Although I was often successful, working with the attorneys for my client, in discrediting the results of the opposing expert, there were times when the judge simply did not understand the issues well enough to differentiate the other expert’s opinions from mine.
Is there a way to ensure that experts actually know the areas about which they opine and a way to encourage them to give honest testimony and strongly discourage them from giving false testimony?
Following are a few ideas about this, though each one carries with it potential problems. Perhaps not all of these ideas can be implemented, but if some or all of them were adopted in the current legal system, we might have just results a higher percentage of the time. Applying these ideas might be especially worthwhile in criminal cases, where an expert’s opinion can be the difference between life and death for a person accused of a crime.
Certain states require that experts be certified in a field of engineering before being allowed to testify about that field in court. My understanding is that few states require certification, and it is rare in those states that an expert is actually disqualified from testifying because of lack of certification.
Perhaps if certification were required, there would be fewer “experts” who are simply looking for ways to do extra work on the side. Similarly, it might be more difficult for attorneys to find “experts” who support their case only because they are not sophisticated enough to understand the technical issues in depth.
One important question is who would run the certification program. There would certainly be some competition and fighting among organizations to implement the certification. Organizations definitely exist, such as the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE), that could set certification standards for computer scientists and electrical engineers respectively.
Other engineering groups could set standards for their own engineers. Perhaps the American Bar Association (ABA) or the American Intellectual Property Law Association (AIPLA) as well as state and federal government offices could also be involved.
A very important consideration would be under what circumstances certification could be revoked.
There would have to be a hierarchy of actions and ramifications ranging from fines to revocation. In reality, many penalties short of revocation would almost certainly result in the end of an expert’s career.
Few attorneys would want to put an expert on the stand who had a record of having been found to be unqualified or dishonest. Also, would any behaviors lead to criminal charges against the expert? Perhaps unethical behavior in a criminal trial should carry stronger punishment, including criminal charges, than similar behavior in a civil trial.
There should be a no-tolerance policy for dishonest, unethical, or illegal behavior by an expert. At a recent conference on digital forensics, a professor gave an example of a student who cheated on a test.
The professor discovered the cheating and confronted the student. The student was sufficiently remorseful, according to the teacher (in my experience most criminals are remorseful once they are caught), and so the professor gave the student a second chance. This was simply a wrong decision.
Remember that digital forensics is the study of sophisticated ways to hack into systems, so this professor could very well be training a criminal. Unfortunately, only about half of the faculty members at the conference agreed with me, and not all of the colleges had official policies regarding cheating. Certainly, all forensics education programs must have zero-tolerance policies, in writing, and any certification program must, too.
One issue that is sure to arise is what to do if no certified expert in a particular field is available to work on the case. Perhaps the technology is very new or specialized. Or perhaps all of the certified experts are conflicted out or simply have no time. It seems that a judge could create an exception, allowing someone with experience in the field to testify in cases where certified experts are not available.
Many experts themselves resist certification requirements because they are already earning a living that they would not want to interrupt in order to study for and take a test that they feel is unnecessary. I also used to think that certification was unnecessary, but having seen the shoddy or unethical work of some experts, I am changing my mind. The government requires that a lawyer pass a bar exam before practicing law, yet experts require no similar test despite their importance to the legal process.
Another way of dealing with this problem is to require neutral experts who are contracted either by the court or jointly by the parties in the case and whose costs are shared by both parties. Currently, there are typically two situations when neutral experts are used. One situation is when the judge decides that the issues involved are too complex for the judge or the jury to understand without an expert in the field to explain them, and a neutral expert can cut through any biases that the experts hired by the parties may have.
Another situation is when the parties agree on an expert and jointly cover the expert’s fees. Hiring only one expert saves time and money in coming to a resolution, and it gives each party a limited ability to persuade the expert. Perhaps neutral experts should be required for every case. The parties could split the cost, or the loser could be required to pay. This seems to be a good solution, particularly if the neutral expert has been certified in her area of expertise. One drawback that should be considered carefully is that a neutral expert who is biased, or whose skills are less than ideal, could draw an incorrect conclusion, and there would be little ability for a party to challenge it on technical grounds.
Of course, having a neutral expert does not preclude the possibility that each party could additionally employ its own expert, though this might further obscure the issues rather than clarify them, given that there could potentially be three different opinions.
Testing of Tools and Techniques
It also seems that tools and techniques used by experts should be tested and certified by an official
body. There have been instances of experts using the wrong tools, either accidentally because t