INDEX
Explanations
the prefix "un-" indicating negation or the absence of a quality
Negative Logits
Theſe          -1.30
ſche           -1.22
MainAxisSize   -1.20
myſelf         -1.19
itſelf         -1.11
ainfi          -1.11
houſe          -1.09
becauſe        -1.05
―――――          -1.04
Majefty        -1.03
Positive Logits
un             1.58
Un             1.53
Un             1.27
UN             1.18
un             1.13
UN             0.97
pre            0.94
Pre            0.94
有不           0.87
G              0.83
Activations Density 0.055%