INDEX
Explanations
references to film and television projects
NEGATIVE LOGITS
reflex        -0.88
persuasion    -0.83
equival       -0.81
occasional    -0.81
agon          -0.80
unexplained   -0.79
habit         -0.77
torture       -0.77
bias          -0.77
naive         -0.77
POSITIVE LOGITS
Tickets        1.60
Meanwhile      1.48
RELATED        1.47
Among          1.47
Also           1.46
According      1.40
Additionally   1.38
Along          1.34
Both           1.33
Speaking       1.32
Activations Density 0.420%