Explanations
statements and beliefs indicating a sense of uncertainty or questioning
Negative Logits
Roskov          -0.91
Monfieur        -0.83
Efq             -0.82
صوتيه           -0.80
Shakspeare      -0.80
UnusedPrivate   -0.78
ſelf            -0.78
myſelf          -0.75
Majefty         -0.72
Phry            -0.71
Positive Logits
But             0.88
but             0.68
但她            0.67
it              0.66
nevertheless    0.66
nonetheless     0.63
yet             0.62
But             0.62
("")]           0.62
Her             0.61
Activations Density 0.296%
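
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is "share of tokens where the feature fires",
    which is one common reading of the dashboard figure.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.296% would mean the feature fires on roughly 3 of every 1,000 sampled tokens, which makes it a sparse feature.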