INDEX
Explanations
prepositions
an end-of-text signal or the conclusion of the document
Negative Logits
horizont   -0.74
derog      -0.69
misunder   -0.68
reperto    -0.67
prest      -0.65
destro     -0.65
accompan   -0.64
forth      -0.64
bet        -0.64
ucci       -0.62
Positive Logits
course     0.84
sorts      0.76
Horror     0.69
Tradable   0.68
SHARES     0.66
course     0.64
Course     0.63
UTERS      0.63
Psy        0.61
icial      0.61
Activations Density 0.039%