INDEX
Explanations
The neuron fires on occurrences of the word “look” and its inflected forms, such as “looks”.
Negative Logits
_pin      -0.07
λί        -0.06
denně     -0.06
málo      -0.06
ById      -0.06
ân        -0.06
ĩnh       -0.06
cole      -0.06
upa       -0.06
yanında   -0.06
Positive Logits
look        0.08
реак        0.07
locking     0.07
remarked    0.07
Looks       0.07
record      0.07
hacking     0.06
hepatitis   0.06
OCC         0.06
.Dec        0.06
Activations Density 0.010%