INDEX
Explanations
The neuron activates on occurrences of the word “lookup” (and its variants like “look up” or “looks up”).
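As a rough illustration of the surface pattern this explanation describes, the sketch below finds "lookup", "look up", and "looks up" in plain text. It is only a text-level proxy for the explanation, not the neuron or its activation function. The names `LOOKUP_PATTERN` and `find_lookup_mentions` are hypothetical, and the optional hyphen and case-insensitivity are assumptions added for convenience.

```python
import re
from typing import List

# Hypothetical text-level proxy for the explanation above (NOT the neuron):
# matches "lookup", "look up", "looks up", and also "look-up" (an assumption),
# ignoring case. Identifier forms such as "_lookup" match via the substring.
LOOKUP_PATTERN = re.compile(r"looks?[ \-]?up", re.IGNORECASE)


def find_lookup_mentions(text: str) -> List[str]:
    """Return every substring of `text` matched by LOOKUP_PATTERN, in order."""
    return LOOKUP_PATTERN.findall(text)
```

Because the pattern has no word boundaries, it also fires on incidental strings such as "outlook update"; a real activation study would compare against the neuron's recorded activations rather than rely on a regex.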
Negative Logits
Arn         -0.07
sent        -0.07
boat        -0.07
全部        -0.07
mist        -0.07
arm         -0.07
cr          -0.06
minimize    -0.06
Band        -0.06
玲          -0.06
Positive Logits
lookup      0.09
Lookup      0.08
Lookup      0.08
lookup      0.07
uphol       0.07
_lookup     0.07
Roth        0.07
hof         0.07
UM          0.07
�           0.07
Activations Density 0.005%