INDEX
Explanations
This neuron activates on the word “ice” when it appears in the context of “ice hockey.”
Negative Logits
Token        Logit
`/           -0.07
.NVarChar    -0.06
\/           -0.06
集团         -0.06
kud          -0.06
hav          -0.06
brisk        -0.06
endowed      -0.06
xec          -0.06
Vulcan       -0.06
Positive Logits
Token        Logit
感じ         0.07
notated      0.07
Snow         0.06
disclosures  0.06
ewire        0.06
Podesta      0.06
κολ          0.06
attracting   0.06
mn           0.06
snow         0.06
Activations Density 0.002%
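
To give a sense of scale for the density figure, here is a minimal sketch that assumes "activation density" means the fraction of tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the synthetic activations are all illustrative assumptions, not taken from the page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# active tokens) / (# tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): 2 active tokens out of 100,000.
acts = [0.0] * 100_000
acts[10] = 1.3
acts[500] = 0.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.002%"
```

Under that assumed definition, a density of 0.002% corresponds to roughly 2 active tokens per 100,000.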