INDEX
Explanations
numerical values and statistics
Negative Logits
Token       Logit
инÑĸ        -0.18
ion         -0.17
istically   -0.15
лиÑĩ        -0.15
sts         -0.15
kowski      -0.15
zek         -0.14
istic       -0.14
ivers       -0.14
ostel       -0.14
Positive Logits
Token       Logit
ture        0.20
ington      0.18
oola        0.16
izu         0.15
aged        0.15
aments      0.15
redi        0.14
bsites      0.14
otton       0.14
erman       0.14
Activations Density: 0.111%