Explanations
information regarding political events or statements
Negative Logits
Token           Logit
LLOW            -0.67
LOCK            -0.56
acea            -0.56
encyclopedia    -0.56
Champ           -0.55
WAR             -0.55
boy             -0.55
MC              -0.55
MI              -0.55
Bone            -0.55
Positive Logits
Token           Logit
uddenly         1.11
suddenly        0.93
alas            0.82
reversed        0.80
morphed         0.79
abruptly        0.77
mysteriously    0.77
instead         0.74
switched        0.73
anwhile         0.73
Activations Density: 0.367%