INDEX
Explanations
statements of what someone thinks or believes
speculations or hypothetical statements about people's beliefs or actions
Negative Logits
Orange        -0.64
pires         -0.62
Kag           -0.62
覚醒          -0.61
SIG           -0.61
{*            -0.60
Xuan          -0.60
underscores   -0.60
Case          -0.60
Scotia        -0.57
Positive Logits
prefer        1.16
rather        1.15
gladly        1.14
nt            1.12
NEVER         1.00
rather        0.97
've           0.95
want          0.93
never         0.93
never         0.92
Activations Density 0.136%