Explanations
modal verbs indicating possession or experience
Negative Logits
won      -0.16
aldi     -0.14
änger    -0.14
haven    -0.14
uzzi     -0.14
hani     -0.14
icias    -0.14
649      -0.14
uat      -0.13
elian    -0.13
Positive Logits
long            0.31
always          0.28
always          0.26
Always          0.23
Always          0.23
ALWAYS          0.22
long            0.22
historically    0.22
tended          0.21
traditionally   0.21
Activations Density 0.197%