INDEX
Explanations
This neuron is activated by present‐tense verbs ending in “-ing.”
Negative Logits
Token    Logit
a        -0.12
A        -0.11
A        -0.10
a        -0.08
ra       -0.08
à        -0.08
LA       -0.08
alpha    -0.08
RA       -0.07
ca       -0.07
Positive Logits
Token    Logit
ing      0.20
ING      0.17
ling     0.14
ning     0.13
ting     0.13
ing      0.13
ping     0.13
ings     0.13
ving     0.12
ding     0.12
Activations Density: 1.763%
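
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of sampled token positions on which the neuron's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    neuron fires at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over eight token positions (not real data).
sample = [0.0, 0.0, 0.7, 0.0, 1.2, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 1.763% would mean the neuron is active on roughly 18 of every 1,000 sampled tokens.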