Explanations
numerical values and percentages related to various contexts
Negative Logits
ordion    -0.14
zon       -0.14
orris     -0.14
анÑĤи     -0.14
iple      -0.14
abase     -0.13
orth      -0.13
WN        -0.13
ëijĺ       -0.13
amarin    -0.13
Positive Logits
non       0.28
ones      0.20
mixed     0.20
non       0.19
mixed     0.19
Non       0.19
semi      0.19
others    0.19
NON       0.18
others    0.17
Activations Density: 0.431%