INDEX
Explanations
ethical and legal standards
Negative Logits
Token    Value
t    1.60
1    1.56
g    1.41
А    1.05
the    1.02
चला    1.01
1    0.99
ט    0.98
ta    0.97
हरा    0.97
Positive Logits
Token    Value
:    1.30
н    1.10
ن    1.02
ERVER    1.02
ン    1.02
нский    1.00
at    0.98
are    0.97
νει    0.96
Hrvatske    0.95
Activations Density 0.000%