Explanations
mentions of specific people and events related to political figures and actions
Negative Logits
шка       -0.07
такими    -0.06
áy        -0.06
Dam       -0.06
alore     -0.06
apore     -0.06
ntl       -0.06
مثلا      -0.06
ologne    -0.06
bon       -0.06
Positive Logits
assis           0.08
onDataChange    0.07
numberWith      0.06
allon           0.06
empt            0.06
sse             0.06
vos             0.06
***!↵           0.06
undi            0.06
ahlen           0.06
Activations Density 0.001%