Explanations
This neuron detects mentions of awards and their recipients (award-related announcement language).
Negative Logits
Token              Logit
ımlı               -0.07
út                 -0.06
greedy             -0.06
Putin              -0.06
colonial           -0.06
ceeded             -0.06
Mein               -0.06
україн             -0.06
..."↵↵             -0.06
Bluetooth          -0.06
Positive Logits
Token              Logit
thetic             0.07
σχ                 0.07
veled              0.06
hải                0.06
ΑΓ                 0.06
CustomAttributes   0.06
ethical            0.06
\Storage           0.06
ableView           0.06
marshaller         0.06
Activations Density 0.018%