INDEX
Explanations
phrases that indicate time duration or history
Negative Logits
onse      -0.18
æ¦        -0.16
ë§ŀ       -0.15
usters    -0.14
voks      -0.14
еÑĢÑĤа     -0.14
antium    -0.14
aments    -0.14
aura      -0.13
kening    -0.13
Positive Logits
nat       0.17
Dial      0.16
Annunci   0.15
course    0.15
eken      0.15
abra      0.14
rench     0.14
Spacer    0.14
ires      0.14
distance  0.14
Activations Density: 0.033%