INDEX
Explanations
instances of the word "yes" with high activation values
affirmative responses or expressions of agreement
Negative Logits
ojure               -0.80
Merit               -0.79
lets                -0.76
bage                -0.75
Offline             -0.74
RAW                 -0.72
rarily              -0.72
externalToEVAOnly   -0.71
DragonMagazine      -0.70
ributed             -0.69
Positive Logits
terday              1.37
yes                 1.06
etheless            0.88
sir                 0.87
YES                 0.79
matter              0.73
Yes                 0.73
sure                0.72
parole              0.70
stranger            0.69
Activations Density: 0.008%