INDEX
Explanations
phrases related to political discourse and leadership accountability
Negative Logits
ado            -0.06
*              -0.06
dint           -0.06
istrovstvÃŃ    -0.06
">//           -0.06
pic            -0.06
ange           -0.06
226            -0.06
util           -0.06
alongside      -0.06
Positive Logits
dialogs        0.07
awe            0.07
ÑĤап           0.07
lobs           0.07
çĺ             0.06
ç°             0.06
Îī             0.06
anyak          0.06
etailed        0.06
:)↵            0.06
Activations Density 0.001%