Explanations
The neuron activates on occurrences of “Greek” or “Greece,” i.e. references to that country or its demonym.
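As a rough way to sanity-check this explanation against text samples, it can be treated as a simple predicate. The sketch below is only a proxy for the explanation, not the neuron itself; the function name, the regular expression, and the sample strings are all hypothetical.

```python
import re

# Hypothetical proxy for the explanation above, not the neuron:
# matches "Greek" or "Greece" as whole words.
PATTERN = re.compile(r"\bGree(?:k|ce)\b")

def explanation_predicts_activation(text: str) -> bool:
    """Return True if the explanation would predict an activation on this text."""
    return PATTERN.search(text) is not None

# Made-up sample strings for illustration only.
samples = ["Ancient Greek philosophy", "a trip to Greece", "the Roman senate"]
predictions = [explanation_predicts_activation(s) for s in samples]
print(predictions)
```

Comparing such predictions with recorded activations would show where the explanation over- or under-predicts; for example, this pattern deliberately ignores related tokens such as "Athens".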
Negative Logits
Token        Logit
Jub          -0.07
Transport    -0.07
malé         -0.07
ashboard     -0.07
discharge    -0.07
NB           -0.06
fad          -0.06
APR          -0.06
ames         -0.06
harb         -0.06
Positive Logits
Token        Logit
Greek        0.15
Greek        0.15
Greece       0.14
Athens       0.11
Greeks       0.10
gods         0.08
ελλην        0.08
Brighton     0.07
reek         0.07
Chinese      0.07
Activations Density 0.006%
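For a sense of scale, the density figure can be converted into an expected count. This is a minimal sketch that assumes density means the fraction of sampled tokens on which the neuron fires (the page does not define it); the sample size is a hypothetical round number.

```python
# Density as reported on the page, in percent.
density_percent = 0.006

# Assumption: density is the share of tokens with a non-zero activation.
fraction = density_percent / 100

# Hypothetical sample size, chosen only for illustration.
tokens_sampled = 100_000

expected_active = fraction * tokens_sampled
print(round(expected_active, 3))
```

Under that assumption the neuron fires on roughly 6 tokens in every 100,000, which fits a feature tied to a narrow set of words.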