INDEX
Explanations
verbs and phrases indicating intentions or actions related to future events
Negative Logits
wick     -0.17
par      -0.14
ink      -0.14
raquo    -0.14
aul      -0.14
pet      -0.13
lej      -0.13
plor     -0.13
ders     -0.13
alone    -0.13
Positive Logits
ishi     0.17
linger   0.16
cil      0.16
ijing    0.16
inyin    0.16
ettle    0.16
uela     0.15
orch     0.14
lient    0.14
get      0.14
Activations Density: 0.065%