INDEX
Explanations
references to specific TV shows and themes like "The Walking Dead" and "Hunger Games"
mentions of popular television shows and film franchises
Negative Logits
quota     -0.69
orts      -0.68
oil       -0.64
oso       -0.63
acid      -0.63
prince    -0.63
heet      -0.62
heed      -0.62
employ    -0.62
quotas    -0.62
Positive Logits
Walking   3.80
walking   1.48
Walk      1.38
Witcher   1.16
alking    1.15
Talking   1.09
Zombies   1.09
walking   1.08
Walk      1.08
Dying     1.06
Activations Density: 0.012%