INDEX
Explanations
government
This neuron detects mentions of government or intelligence actors (e.g., governments, agencies, spies).
Negative Logits
ěst       -0.06
_splits   -0.06
ôle       -0.06
ho        -0.06
ména      -0.06
Perl      -0.06
ač        -0.06
ât        -0.06
Bite      -0.06
<Input    -0.06
Positive Logits
.active     0.07
Canberra    0.07
minOccurs   0.07
_enter      0.06
Kushner     0.06
.des        0.06
((↵         0.06
без         0.06
halluc      0.06
нер         0.06
Activations Density: 0.011%