Explanations
affirmative responses or confirmations
NEGATIVE LOGITS
"");      -0.82
"));      -0.76
》.       -0.75
munk      -0.75
"]).      -0.74
*/}       -0.72
')]       -0.71
}]        -0.70
).)       -0.70
)});      -0.70
POSITIVE LOGITS
YES           1.29
Yes           1.22
YES           1.20
yes           1.18
Yes           1.14
yes           1.14
DIPSETTING    0.88
Yep           0.85
yep           0.83
Yea           0.81
Activation density: 0.076%
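Assuming the density figure is the fraction of sampled tokens on which the feature has a positive activation (a common convention, but not confirmed by this page), it could be computed as in the sketch below. The helper name and the sample data are hypothetical.

```python
def activation_density(activations):
    """Fraction of tokens whose activation for this feature is above zero."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 10,000 tokens.
sample = [0.0] * 9999 + [3.2]
print(f"{activation_density(sample):.3%}")
```

On that reading, a density of 0.076% would mean the feature fires on fewer than one token in a thousand.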