INDEX
Explanations
phrases indicating a duration of time or continuity
Negative Logits
steel      -0.14
upon       -0.14
elix       -0.14
anh        -0.14
Hang       -0.14
roid       -0.13
ANSW       -0.13
éric       -0.13
den        -0.13
available  -0.13
Positive Logits
ancellor   0.16
era        0.16
ago        0.16
mî         0.16
enek       0.15
verty      0.15
Jeg        0.15
olon       0.14
ازی        0.14
oader      0.14
Activations Density: 0.013%