INDEX
Explanations
traditional methods or formats
NEGATIVE LOGITS
Token       Logit
of          1.02
relacion    1.02
ни          0.97
つ          0.91
ă           0.91
vět         0.88
ку          0.87
énorm       0.87
يا          0.86
し          0.86
POSITIVE LOGITS
Token        Logit
I            1.39
ir           1.24
us           1.16
ang          1.13
traditional  1.10
Been         1.09
Traditional  1.02
aditional    1.00
k            0.98
al           0.95
Activations Density: 0.017%