INDEX
Explanations
ambiguous or uncertain language
expressions of uncertainty or lack of clarity
Negative Logits
Token      Logit
azar       -0.79
arms       -0.77
rather     -0.76
zbollah    -0.74
ACTIONS    -0.71
paio       -0.67
MAY        -0.65
oubted     -0.65
YES        -0.65
gdala      -0.63
Positive Logits
Token          Logit
anymore        1.66
nor            1.30
yet            1.03
yet            0.96
necessarily    0.92
anything       0.88
anybody        0.86
whatsoever     0.86
any            0.83
anywhere       0.82
Activations Density 0.320%