Explanations
sentences ending with a strong or emphatic statement
instances of the word "no" and associated negation contexts
Negative Logits
Token       Logit
gypt        -0.70
ounded      -0.68
mud         -0.66
ounds       -0.65
abal        -0.64
tro         -0.62
oufl        -0.61
arthy       -0.61
MpServer    -0.61
etitive     -0.61
Positive Logits
Token       Logit
sir         0.97
nor         0.75
thank       0.71
æĪ          0.71
onsense     0.69
kidding     0.67
nor         0.67
whatsoever  0.65
Shift       0.64
dear        0.64
Activations Density 0.055%
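
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which the feature fires. This is an assumption about the metric, not a confirmed description of how this dashboard derives it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The source does not define the metric.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 11 of 20,000 token positions.
sample = [0.0] * 19_989 + [1.3] * 11
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

A density this low means the feature is sparse: it is silent on nearly all tokens and fires only in narrow contexts, such as the emphatic or negating phrases listed under Explanations.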