Explanations
phrases indicating historical continuity or duration
Negative Logits
Token       Logit
edith       -0.17
rrha        -0.17
aec         -0.17
ollapsed    -0.16
Markup      -0.16
esus        -0.16
isoft       -0.15
ÑĪев        -0.14
éli         -0.14
tae         -0.14
Positive Logits
Token       Logit
forever     0.20
199         0.19
197         0.18
before      0.17
198         0.17
childhood   0.17
ido         0.17
PLICATION   0.17
187         0.16
192         0.15
Activations Density: 0.052%