INDEX
Explanations
phrases indicating the completion of events or time periods
Negative Logits
anko      -0.15
lds       -0.14
otope     -0.14
jde       -0.14
Ëĺ        -0.14
igi       -0.14
avic      -0.14
neh       -0.14
CESS      -0.13
ettings   -0.13
Positive Logits
istrovstvÃŃ   0.16
veis          0.16
Byl           0.15
ward          0.15
vard          0.15
uard          0.15
/top          0.14
ostel         0.14
stage         0.14
isen          0.14
Activations Density 0.031%
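
For a sense of scale, the density figure can be read as the fraction of token positions on which the feature has a nonzero activation. The sketch below assumes that reading; the function name `activation_density`, the threshold of zero, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 31 of which activate.
sample = [0.0] * 99_969 + [1.5] * 31
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.031% corresponds to roughly 31 active positions per 100,000 tokens.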