Trade article

The benefits and risks of artificial intelligence as legal evidence

The term Artificial Intelligence can evoke fears of a dystopian robotic world devoid of human restraint. While its most nefarious applications receive the greatest attention, AI routinely improves the quality of our daily lives.

AI helps train surgeons and business professionals by creating visuals that never existed in the real world. But governments also use AI for facial recognition and for solving crimes, which raises questions of ethics and privacy.

In this article, we will examine what AI can do for media forensics, its impact and limitations, and the current round-up of industry-accepted AI software applications. Let’s first understand how media AI works.

A person can recognize the letter “A”, even when it is poorly handwritten, or distorted in a CAPTCHA website challenge. This is because our brain has enough prior experience to recognize a familiar item in an unfamiliar configuration. Similarly, AI uses its library of training data to know how to process unknown data. Since this process mimics the human brain, it is referred to as a neural network. Without AI, software can only measure the length and relative relationship of the lines that the “A” is made of, and hope that they match the expected pattern. AI is just the natural progression of computer learning.
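
The difference between rigid pattern matching and working from a library of prior examples can be sketched in a few lines of Python. This is only an illustration, not a neural network: the tiny 5x5 letter bitmaps, the `LIBRARY` dictionary, and the pixel-difference score are all assumptions made for this example, standing in for the far richer training data and learned similarity that real AI software uses.

```python
# Hypothetical library of known letters, drawn as 5x5 bitmaps ('#' = ink).
LIBRARY = {
    "A": [".###.", "#...#", "#####", "#...#", "#...#"],
    "H": ["#...#", "#...#", "#####", "#...#", "#...#"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
}

def exact_match(glyph):
    """Rigid matching: succeeds only if every pixel equals a known pattern."""
    for label, template in LIBRARY.items():
        if glyph == template:
            return label
    return None

def pixel_distance(glyph, template):
    """Count the pixels that differ between two bitmaps."""
    return sum(
        a != b
        for row_g, row_t in zip(glyph, template)
        for a, b in zip(row_g, row_t)
    )

def closest_match(glyph):
    """Experience-based matching: pick the most similar known letter."""
    return min(LIBRARY, key=lambda label: pixel_distance(glyph, LIBRARY[label]))

# A poorly written "A": one stroke is short and one stray pixel was added.
sloppy_a = [".##..", "#...#", "#####", "#...#", "#..##"]

print(exact_match(sloppy_a))    # rigid matching finds nothing
print(closest_match(sloppy_a))  # similarity matching still finds the "A"
```

The rigid matcher gives up as soon as a single pixel is out of place, while the similarity-based matcher tolerates the distortion because it compares the unknown shape against everything it has seen before.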

Let’s take the example of removing noise from a video, image or audio file. This is a destructive process, because it is impossible to remove noise without leaving voids in the data where that noise had existed. Adjacent data might be unintentionally affected in this process, and valid data can be accidentally removed when mistaken for noise. Software must then decide how to fill the voids left by the removed noise, usually by making estimations using the remaining adjacent data. While logical, this process is inserting new data.
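
To make the point about inserted data concrete, here is a minimal Python sketch of one classic denoiser, a 3x3 median filter. The choice of filter, the 5x5 test scene, and the pixel values are assumptions made purely for illustration; forensic applications offer many denoising algorithms and this is not a description of any particular product.

```python
from statistics import median

def median_denoise(image):
    """Replace every pixel with the median of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            # Gather the neighborhood, clamping coordinates at the borders.
            window = [
                image[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            row.append(int(median(window)))
        result.append(row)
    return result

# A flat gray scene (value 100) containing one small genuine detail.
scene = [[100] * 5 for _ in range(5)]
scene[1][1] = 180            # genuine detail, e.g. a small reflection

noisy = [row[:] for row in scene]
noisy[3][3] = 255            # a single noise spike

cleaned = median_denoise(noisy)
print(cleaned[3][3])  # the spike is gone, replaced by an estimate from its neighbors
print(cleaned[1][1])  # the genuine detail was mistaken for noise and removed too
```

The value that replaces the noise spike was never captured by the camera; it was estimated from adjacent pixels. The same pass also erased the genuine detail, which is exactly the kind of collateral loss described above.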

Traditionally, the user of enhancement software selects which denoising algorithm is to be applied, and at what strength. The quality of the end results then depends upon the experience of the user, and the possible influence of their subconscious bias. AI reduces these issues by having the software dynamically make decisions based solely upon the contents being processed. AI can also make far more logical assumptions regarding the scene data that should have existed in lieu of the current noise data. The courts have begun to weigh the benefits of AI against the risks of judgement-based enhancements.

There are numerous forensic applications that provide photo and video enhancement (denoising, lighting, focus, sharpness, etc.) and audio enhancement (denoising, volume, fidelity, etc.). Those applications range in cost from free to budget busters, with each offering unique strengths and weaknesses. I wrote an earlier article in eForensics Magazine that compared those forensic applications, so here I will focus only on forensic AI applications.

When it comes to forensic audio AI, iZotope is the king with its RX Advanced offering. The software sells for around $1,000 and includes a broad range of AI and non-AI audio filters that can salvage seemingly hopeless recordings. Of course, you can try before you buy. With no significant AI competitor, this is a must-have application for audio forensics.

Photographic AI is a subset of video AI, and every video AI company also provides a photo AI solution. Because of the massive data present within a video, as compared to the smaller data density of audio or a photograph, it is video AI that provides the most impressive, and sometimes frustrating, forensic results.

A surveillance video can have multiple concurrent issues (video noise, extreme darkness, motion blur, incorrect focus, etc.), and the weight of each issue can fluctuate across the video’s duration. In such circumstances, AI will attempt to address multiple issues simultaneously to produce results that are clearer than what would be possible using traditional enhancement methods. Since AI is automated and requires less user input, it is also much easier for a novice to use.

This is not to imply that AI is a magic bullet. Sometimes the cumulative visual issues can be so severe that the AI algorithm selects the wrong solution, or over-processes the correct model, resulting in a grossly inaccurate outcome.

Real-world experience has shown that AI generally performs best when applied after traditional enhancements have improved the scene lighting and sharpness. If this workflow applies to your results, then the pre-AI results should be used to lay the forensic foundation for your AI results. AI is a powerful tool, not a singular solution, and AI results should never be accepted as fact without a clear justification for their use. Here are the leading forensic AI video applications.

Topaz Video Enhance AI
Although not designed for the forensic community, Topaz Video Enhance AI is a powerful tool, especially with videos that have decent lighting and minimal visual noise. You can expect significant clarity improvements to both geometric (e.g. vehicles, license plates, logos, text) and non-geometric (e.g. faces, distorted reflections) shapes and objects. The software also removes stubborn camera noise and upscales the resolution.

The user interface is simple, with automatic settings suggestions and a fast preview feature. Topaz Video Enhance AI has excellent multi-core and GPU optimization, costs a modest $199 for a lifetime license with upgrades, and includes a try-before-you-buy evaluation. But there are some limitations.

If you want your export results to be lossless (in accordance with forensic best practices), then your only option is to save your video as a series of still images. You then must use some other application to reconstitute your video. While feature requests have been made to add an uncompressed video export option, this has yet to be addressed.
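
As one possible way to handle that reassembly step, the following Python sketch builds a command for FFmpeg, a free command-line tool, to pack a numbered series of stills into a lossless video. Several details here are assumptions rather than anything Topaz prescribes: the `frame_%06d.png` naming pattern, the `exported_frames` folder, the 30 frames-per-second rate, and the choice of the lossless FFV1 codec in a Matroska (.mkv) container. Adjust each to match your actual export and the original recording.

```python
import subprocess
from pathlib import Path

def build_ffmpeg_command(frame_dir, fps, output_file, pattern="frame_%06d.png"):
    """Build an FFmpeg command that packs numbered stills into an FFV1 video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),                # frame rate of the original recording
        "-i", str(Path(frame_dir) / pattern),  # numbered still images (assumed naming)
        "-c:v", "ffv1",                        # FFV1 is a lossless video codec
        str(output_file),
    ]

def reassemble(frame_dir, fps, output_file):
    """Run FFmpeg (must be installed and on the PATH) to rebuild the video."""
    subprocess.run(build_ffmpeg_command(frame_dir, fps, output_file), check=True)

# Print the command for review; call reassemble() to actually execute it.
command = build_ffmpeg_command("exported_frames", 30, "enhanced.mkv")
print(" ".join(command))
```

Printing the command before running it gives you an exact record of how the video was rebuilt, which can be kept with your case notes.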

AVC Labs Video Enhancer
What sets AVC apart is its multi-frame processing mode, which can produce stunning video clarity and noise reduction. Extensive testing has shown that geometrically shaped objects benefit the most, with a clarity that the other tools I tested could not match. Furthermore, low-light noise overlays are eliminated without requiring the use of more destructive traditional denoisers. However, non-geometric objects can undergo a cartoonish level of distortion as the algorithm tries to make sense of the scene.
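
AVC does not publish how its multi-frame mode works, so the Python sketch below is not its algorithm. It only illustrates the general idea behind using several frames at once: if the scene holds still, averaging many noisy frames of it cancels much of the random noise while the real detail remains. The static one-line "scene", the noise level, and the count of 16 frames are assumptions chosen for the demonstration.

```python
import random

def add_noise(scene, rng, sigma=20.0):
    """Simulate one captured frame: the true scene plus random sensor noise."""
    return [value + rng.gauss(0, sigma) for value in scene]

def average_frames(frames):
    """Combine frames by averaging each pixel position across all of them."""
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]

def mean_abs_error(estimate, truth):
    """Average distance between an estimate and the true scene."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)

rng = random.Random(0)  # fixed seed so the demonstration is repeatable

# One row of pixels with hard edges, like the strokes of a license plate.
scene = [50.0, 50.0, 200.0, 200.0, 50.0, 50.0] * 50
frames = [add_noise(scene, rng) for _ in range(16)]

single_error = mean_abs_error(frames[0], scene)
stacked_error = mean_abs_error(average_frames(frames), scene)
print(single_error, stacked_error)  # the stacked result sits much closer to the truth
```

This simple averaging only works when the subject does not move between frames. Rigid, geometric objects are comparatively easy to line up from one frame to the next, whereas a face changes shape as it moves, which may be part of why non-geometric objects fare worse in multi-frame processing.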

The software has excellent multi-core and GPU optimization, and the end results can be saved lossless as either a video or a series of stills. AVC provides responsive support and accommodates feature requests from its forensic clients. Since forensics is a smaller user group, the fee is a steeper $499 for a lifetime license with upgrades. There is a try-before-you-buy evaluation, and short-term leasing options, but leasing can be more expensive in the long term. AVC’s multi-frame feature is a must-have, Hail Mary application of last resort.

For example, I had a case (“Nathanial Ian Dickinson Wild v. Marlene Gomez”, Los Angeles Superior Court case #21PSRO01751) in February of 2022 that hinged on whether the depicted female was holding a mobile phone or a gun. I had tried all of my traditional enhancement tools, but none could provide the required clarity, and the primary question remained open to interpretation. I then turned to Topaz AI, which did get slightly closer, but it was the multi-frame mode of AVC’s AI that definitively answered this case-critical question. The object could be clearly seen as a phone, even though the non-geometric objects (e.g. her face) became distorted. My AI results were accepted by the court, and the case settled favorably. Several of my peers have shared similar success stories, which is leading to broader courtroom acceptance of AI.

If your budget can support the expense, I strongly recommend purchasing AI offerings from both Topaz and AVC. Be sure to have a decent graphics card in your computer, as AI processing is extremely math-intensive and can require hours, especially in multi-frame mode.

I can’t comment upon the amazing enhancement capabilities of AI without also mentioning its nefarious use in creating Deepfakes, realistic videos that depict something that never occurred. While common in Hollywood movies, the newsworthy Deepfakes are AI fabrications of a person saying or doing something, usually for political or revenge purposes. Let’s look at how this occurs.

Excerpts from existing recordings of the victim are fed into the AI software, which isolates the target head for each utterance, and then aligns the head horizontally, fixed and forward facing. Then the AI stitches together moments of mouth movement to compose the desired final recording. During subsequent iterations, transitions between mouth movements, eye blinking, breathing, and subtle head motions are adjusted and synchronized to improve the overall appearance. It is not uncommon for the resulting Deepfake to appear more credible than the real recordings. The state of New Jersey wrote an in-depth article on Deepfakes.

Creating a Deepfake requires some basic skills and time, because commercial software is not yet available due to concerns over copyright or tort violations. Eventually, some hacker may anonymously release an automated application just to watch the havoc it causes.

I have made a career out of detecting Deepfakes for the major news services, and this has become an endless game of whack-a-mole. Each time our industry devises a new method to detect Deepfakes, that knowledge inevitably becomes a testing tool used to improve the effectiveness of Deepfake software.

A less nefarious application of Deepfake technology is to convert a jerky one-frame-per-second video into a fluid thirty-frame-per-second video, by synthesizing the missing interim moments. However, this creates the risk of inventing events that never occurred (e.g. when a gun was fired). AI has also begun to appear in the courtroom in the form of incident recreations, and the conversion of 2-D evidence into 3-D projections.
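
The risk of synthesized moments can be shown with a deliberately simple Python sketch. Real AI interpolation estimates motion using learned models; the plain linear blending used here is an assumption chosen only to keep the example short. The four-pixel "frames" and the muzzle flash scenario are likewise hypothetical.

```python
def interpolate_frames(frame_a, frame_b, steps):
    """Synthesize `steps` in-between frames by blending two real frames."""
    synthesized = []
    for i in range(1, steps + 1):
        weight = i / (steps + 1)  # how far along we are from frame_a to frame_b
        synthesized.append(
            [(1 - weight) * a + weight * b for a, b in zip(frame_a, frame_b)]
        )
    return synthesized

# Two real frames captured one second apart; both show a dark scene.
frame_a = [0.0, 0.0, 0.0, 0.0]
frame_b = [0.0, 0.0, 0.0, 0.0]

# Going from 1 to 30 frames per second means inventing 29 frames per real one.
between = interpolate_frames(frame_a, frame_b, 29)

print(len(between))                                      # number of invented frames
print(all(frame == frame_a for frame in between))        # every one of them is dark
```

If a brief muzzle flash occurred between the two real captures, no interpolation method can recover it, because it was never recorded. The smooth result simply shows an uneventful second. The reverse also holds: an object present in only the second real frame would fade in gradually across the invented frames, depicting a gradual appearance that may never have happened.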

Soon, AI may allow anyone to become a content creator by simply dictating a story, and then letting the software build the visuals. Users may be able to take a mobile phone video, highlight an unidentifiable face or license plate, and receive the desired clarity moments later. This future should seem logical to those who have tried Google’s visual search tool, Google Lens.

Audio AI is beginning to catch up with its visual counterpart. You may have experienced this when you last spoke with your bank’s virtual assistant. It can seem magical that the computer can understand your questions, even when your voice is accompanied by distracting background sounds (e.g. a TV or radio, other people, city sounds, etc.). Soon, AI audio applications may provide one-click solutions to remedy nearly any audio issue.

Today’s media AI solutions were science fiction just a decade ago. In another decade, real-time AI processing may become a standard feature of our smartphones. I don’t know about you, but I can’t wait for tomorrow.