Explanations
negations or instances of the prefix "de-"
Negative Logits
Theſe            -1.18
WillAppear       -1.01
BufferException  -0.98
iſt              -0.96
Anſ              -0.94
Beſ              -0.93
ainfi            -0.89
+#+#             -0.88
myſelf           -0.87
ScopeManager     -0.87
Positive Logits
de   2.72
De   2.28
De   2.22
de   1.95
DE   1.87
DE   1.62
де   1.58
デ   1.20
Де   1.04
Де   1.04
Activations Density: 0.082%
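
The logit tables can be checked against the explanation with a few lines of code. The following is a minimal sketch, not part of any dashboard tooling: the token/value pairs are copied from this page, while the names `POSITIVE_LOGITS`, `NEGATIVE_LOGITS`, `DE_FORMS`, `fraction_matching`, and `strongest` are hypothetical, and the choice of which spellings count as "de" (Latin, Cyrillic, katakana) is an assumption made for illustration.

```python
# Token/logit pairs copied from the tables on this page.
# All names below are illustrative, not part of any real API.
POSITIVE_LOGITS = [
    ("de", 2.72), ("De", 2.28), ("De", 2.22), ("de", 1.95), ("DE", 1.87),
    ("DE", 1.62), ("де", 1.58), ("デ", 1.20), ("Де", 1.04), ("Де", 1.04),
]
NEGATIVE_LOGITS = [
    ("Theſe", -1.18), ("WillAppear", -1.01), ("BufferException", -0.98),
    ("iſt", -0.96), ("Anſ", -0.94), ("Beſ", -0.93), ("ainfi", -0.89),
    ("+#+#", -0.88), ("myſelf", -0.87), ("ScopeManager", -0.87),
]

# Assumption: spellings treated as "de" in Latin, Cyrillic, and katakana.
DE_FORMS = {"de", "де", "デ"}


def fraction_matching(pairs, forms):
    """Return the share of tokens that equal one of `forms`, ignoring case."""
    hits = sum(1 for token, _ in pairs if token.strip().casefold() in forms)
    return hits / len(pairs)


def strongest(pairs):
    """Return the (token, logit) pair with the largest absolute logit."""
    return max(pairs, key=lambda pair: abs(pair[1]))


if __name__ == "__main__":
    print("positive tokens matching 'de':", fraction_matching(POSITIVE_LOGITS, DE_FORMS))
    print("negative tokens matching 'de':", fraction_matching(NEGATIVE_LOGITS, DE_FORMS))
    print("strongest positive:", strongest(POSITIVE_LOGITS))
    print("strongest negative:", strongest(NEGATIVE_LOGITS))
```

With the values listed here, every positive token is a case or script variant of "de" and none of the negative tokens is, which is consistent with the "de-" half of the explanation.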