INDEX
Explanations
location names
The neuron fires on occurrences of the administrative-division term “Department” (especially in “Department of …” or “Departments of …” contexts).
Negative Logits
(schedule    -0.07
_sim         -0.06
Mu           -0.06
arousal      -0.06
mix          -0.06
Flat         -0.06
aria         -0.06
AXIS         -0.06
references   -0.06
HOT          -0.06
Positive Logits
اعية    0.07
etine    0.07
tarı    0.07
تحت    0.07
στις    0.06
Routes    0.06
ctx    0.06
}px    0.06
_codes    0.06
сопров    0.06
Activations Density 0.007%