INDEX
Explanations
proper nouns or specific names
initial tokens
Negative Logits
Token           Logit
Infór           -0.67
urlpatterns     -0.56
boneca          -0.55
Económica       -0.55
anún            -0.54
Budaya          -0.50
feroit          -0.50
MessageOf       -0.50
WriteBarrier    -0.50
bēr             -0.50
POSITIVE LOGITS
Token           Logit
Datuak          0.59
,-              0.46
:"-"`           0.46
-(-             0.43
$-\             0.43
==-             0.43
(no token shown) 0.42
--;             0.42
-'              0.41
__;             0.41
Activations Density: 0.088%