Explanations
sequence
This neuron primarily activates on the token “sequence” (as used in the repeated “probability of sequence …” phrases).
Negative Logits
tn              -0.07
parentNode      -0.07
wildly          -0.06
ActiveRecord    -0.06
托              -0.06
ูก              -0.06
LOSS            -0.06
tokenize        -0.06
оты             -0.06
onions          -0.06
Positive Logits
-General        0.07
Action          0.07
feit            0.07
dtype           0.06
.makedirs       0.06
_context        0.06
šem             0.06
Health          0.06
.netty          0.06
시에            0.06
Activations Density: 0.001%