INDEX
Explanations
conditional situations
The main thing this neuron does is detect references to drugs (e.g. mentions of “drogas”).
Negative Logits
624          -0.07
29           -0.07
037          -0.07
sands        -0.06
gode         -0.06
Trap         -0.06
extensions   -0.06
Forbidden    -0.06
Trading      -0.06
önc          -0.06
Positive Logits
(interval    0.07
suyu         0.06
ferred       0.06
�인          0.06
urg          0.06
antom        0.06
swiper       0.06
زیرا         0.06
غط           0.06
{};↵↵        0.06
Activations Density 0.053%
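
The density figure above can be read as the share of token positions on which the neuron fires. The page does not say how it is computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron is
    active; the dashboard's exact definition is not stated in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which activate the neuron.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.050%"
```

Under this reading, a density of 0.053% would correspond to roughly one active token in every 1,900.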