INDEX
Explanations
phrases related to political accusations or controversies
Negative Logits
allo            -0.17
/Instruction    -0.15
μα              -0.15
pga             -0.15
ofil            -0.14
enden           -0.14
lds             -0.14
uset            -0.14
.resolve        -0.13
894             -0.13
Positive Logits
column          0.19
columns         0.19
Slate           0.18
Salon           0.18
essays          0.17
column          0.17
columnist       0.17
Atlantic        0.17
salon           0.16
Establishment   0.16
Activations Density: 0.181%