Explanations
mathematical expressions and comparisons
Negative Logits
Token       Logit
phazard     -0.45
tambahan    -0.43
<?>>        -0.42
+#+#        -0.42
Pim         -0.40
zonder      -0.40
nhs         -0.40
Zon         -0.38
Ruh         -0.36
cession     -0.36
Positive Logits
Token       Logit
None        1.34
none        1.30
None        1.23
none        1.11
ninguno     1.03
NONE        1.03
neither     0.96
neither     0.90
Neither     0.89
Neither     0.87
Activations Density: 0.423%