INDEX
Explanations
verbs and their conjugations in the context of noun phrases
Negative Logits
ner      -0.18
ser      -0.17
son      -0.16
dec      -0.16
erty     -0.15
erate    -0.15
INGER    -0.15
er       -0.15
ners     -0.15
cke      -0.15
POSITIVE LOGITS
egg      0.20
ohl      0.19
eyen     0.19
iliary   0.16
eyer     0.16
razier   0.16
afil     0.15
굴       0.15
akens    0.15
bach     0.15
Activations Density 0.052%