INDEX
Explanations
references to political and social influence related to governmental processes and actions
Negative Logits
_________________↵↵   -0.11
.DropDown             -0.11
maktan                -0.09
/******/              -0.09
_CNTL                 -0.09
Pazar                 -0.09
ÙĬÙĦاد                -0.09
Böl                  -0.09
¿ÃĤ                   -0.09
ÏģοÏį                 -0.09
Positive Logits
â̦↵                   0.09
<|end_of_text|>       0.07
ab                    0.07
ci                    0.07
...↵                  0.07
iro                   0.07
(no token shown)      0.07
Âł                    0.06
ast                   0.06
famously              0.06
Activations Density 0.180%