Explanations
The neuron activates on occurrences of the word “country.”
Negative Logits
Token        Logit
exe          -0.07
lever        -0.07
8            -0.07
він          -0.07
Starter      -0.07
astle        -0.07
helm         -0.07
Mix          -0.07
.Selection   -0.07
.Test        -0.07
Positive Logits
Token        Logit
country      0.16
Country      0.14
countries    0.13
Country      0.12
-country     0.11
country      0.11
Countries    0.11
país         0.10
countries    0.10
_country     0.09
Activations Density 0.031%
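
The density figure above is read here as the share of sampled token positions on which the neuron fires at all. That reading is an assumption, since the page does not define the term. Under that assumption, a minimal sketch of the calculation follows; the function name `activation_density`, the `threshold` parameter, and the toy activation list are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 3 of them active.
acts = [0.0] * 10_000
acts[5] = 1.2
acts[77] = 0.4
acts[9000] = 2.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.030%
```

A density this low would mean the neuron is silent on almost all tokens, which fits an explanation tied to a single word and its close variants.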