INDEX
Explanations
instances of political commentary or critique
Negative Logits
341      -0.17
chalk    -0.15
ascade   -0.14
erli     -0.14
ials     -0.14
cliffe   -0.14
scape    -0.14
eph      -0.14
족       -0.13
opc      -0.13
Positive Logits
cent        0.15
Bak         0.15
Cent        0.15
symbolic    0.14
ILT         0.14
اتحاد       0.14
reception   0.14
rea         0.14
Kore        0.13
aram        0.13
Activations Density 0.167%