The strategies below are becoming less reliable as AI video generation improves. However, low-quality AI-generated videos from before 2025 are usually fairly easy to spot if you know what to look for, and most people and their parents should be able to avoid being fooled as long as they stay vigilant. Here are some simple tips and examples.
The most common technical indicators of AI-generated or manipulated videos are temporal and spatial inconsistencies: small but noticeable errors that break realism. Individually these issues are not a big deal, but put enough of them together and after a few seconds something will start to feel strange.
Visual cues:
All of the problems below are related to underlying weaknesses of generative models in maintaining realistic physics, lighting, and timing across video frames.
The most obvious signs are usually things suddenly popping up or disappearing from one frame to the next. This is called temporal inconsistency, and it happens most often when an object is blocked from view and then fails to reappear once it should no longer be hidden. This is similar to how babies lack object permanence, and it becomes more of a problem the longer the video gets. You can imagine that the AI either does not remember beyond a certain number of frames or does not understand that those things are still relevant later on. This effect can be easy to miss if many things are moving quickly at once, or if the video is very short.
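For readers who like to tinker, here is a minimal sketch of how a sudden pop-in could be flagged automatically. It assumes the video has already been decoded into small grayscale frames stored as nested lists of brightness values; the function names, the `factor` and `min_score` parameters, and the tiny example frames are all made up for illustration. It is a crude heuristic, not a real detector: an ordinary scene cut would trigger it too.

```python
from statistics import median


def frame_change_scores(frames):
    """Mean absolute pixel difference between each pair of consecutive frames."""
    scores = []
    for prev, curr in zip(frames, frames[1:]):
        total = 0
        count = 0
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
        scores.append(total / count)
    return scores


def flag_abrupt_changes(scores, factor=3.0, min_score=1.0):
    """Indices of transitions that change far more than the typical transition.

    factor and min_score are illustrative defaults, not tuned values.
    """
    threshold = max(factor * median(scores), min_score)
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical 2x2 frames: a bright object "pops in" at the fourth frame.
frames = [
    [[10, 10], [10, 10]],
    [[11, 10], [10, 10]],
    [[11, 11], [10, 10]],
    [[11, 11], [200, 10]],
    [[12, 11], [200, 10]],
]
scores = frame_change_scores(frames)
flagged = flag_abrupt_changes(scores)  # [2]: the jump between the third and fourth frames
```

Comparing each transition against the median keeps normal motion from being flagged, since only changes much larger than the usual frame-to-frame difference stand out.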
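Illumination inconsistencies often appear when lighting on the subject doesn’t match the scene. The most obvious things to look for here are the angles and shapes of shadows. Working out how light travels through a scene (ray tracing) is a notoriously expensive computational problem, and generative models typically imitate lighting from their training data rather than computing it, but AI seems to be slowly getting better at it.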
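As a rough illustration of the shadow check, the sketch below measures how far shadow directions stray from their average. It assumes you have already estimated a direction in degrees for each shadow by hand or with some other tool; the function name and the sample angles are hypothetical. It also assumes a single distant light source such as the sun. Nearby lamps and camera perspective can make real shadows point in different directions, so a large spread is a hint, not proof.

```python
import math


def shadow_angle_spread(angles_deg):
    """Largest deviation (degrees) of any shadow direction from the circular mean."""
    # Average the directions as unit vectors so that 359 and 1 degrees count as close.
    sx = sum(math.cos(math.radians(a)) for a in angles_deg)
    sy = sum(math.sin(math.radians(a)) for a in angles_deg)
    mean_angle = math.degrees(math.atan2(sy, sx))

    def angular_diff(a, b):
        # Smallest absolute difference between two angles, in the range 0..180.
        return abs((a - b + 180) % 360 - 180)

    return max(angular_diff(a, mean_angle) for a in angles_deg)


# Hypothetical measurements: one scene with consistent shadows, one with an outlier.
consistent = shadow_angle_spread([40, 42, 44])
suspicious = shadow_angle_spread([40, 42, 130])
```

In this example the first scene has a spread of about 2 degrees, while the second is far larger because one shadow points in a very different direction from the others.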
Object collisions or interactions may behave unrealistically if there is an error with the depth mapping, for example when objects pass through each other instead of touching. This is more common when there is uncertainty about the positioning of objects in the frame or if there is an unusual perspective.
Other issues that were common in AI image generators, such as malformed hands (often with the wrong number of fingers) or illegible text, can also carry over to video. However, these have largely been fixed in image generators already.
If you pay close attention, comparing the video with the audio may reveal lip-sync errors and prosodic inconsistencies, where mouth movements and tone (the rhythm, stress, and intonation of speech) don’t align. These are harder to catch but also harder for AI to do well. The video above explains prosodic features very well.
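The lip-sync idea can be sketched in a few lines. The code below assumes you already have two per-frame measurements, how open the mouth is and how loud the audio is; getting those from a real video is a separate problem and is not shown. The function names, `max_lag`, and the sample numbers are invented for illustration. It slides the audio against the mouth movement and reports the shift at which the two line up best.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(mouth, audio, max_lag=5):
    """Return (lag, correlation) for the shift that best aligns audio with mouth.

    A positive lag means the audio trails the mouth movement by that many frames.
    """
    best = (0, pearson(mouth, audio))
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            m, a = mouth[:len(mouth) - lag], audio[lag:]
        else:
            m, a = mouth[-lag:], audio[:len(audio) + lag]
        if len(m) < 3:
            continue  # too few overlapping frames to say anything
        r = pearson(m, a)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical data: the audio is the mouth signal delayed by two frames.
mouth = [0, 1, 3, 1, 0, 0, 2, 4, 2, 0, 0, 1]
audio = [0, 0] + mouth[:10]
lag, r = best_lag(mouth, audio)  # lag is 2, r is close to 1
```

A best lag far from zero, or a low correlation at every lag, would suggest the mouth and the sound are not moving together. Real speech is messier than this toy signal, so treat the result as one clue among several.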