Explanations
the setup animal originals rules
Negative Logits
Token    Value
И        2.11
and      1.89
to       1.66
С        1.53
Я        1.52
с        1.52
volna    1.51
У        1.50
프       1.48
А        1.47
Positive Logits
Token         Value
jenigen       2.08
zelfde        2.02
mselves       1.97
oretically    1.93
ia            1.83
ere           1.80
్             1.70
iv            1.68
ución         1.63
indest        1.61
Activations Density: 0.518%