INDEX
Explanations
phrases related to answers or responses, particularly with emphasis
statements regarding answers to yes/no questions
Negative Logits
Token       Logit
models      -0.68
Person      -0.67
astical     -0.67
Franch      -0.66
ventures    -0.66
usp         -0.66
vre         -0.66
sers        -0.65
jan         -0.64
rongh       -0.63
Positive Logits
Token            Logit
yes              1.32
YES              1.17
unequiv          1.12
affirmative      1.04
simple           1.00
emph             0.98
obvious          0.97
YES              0.95
straightforward  0.89
nil              0.87
Activations Density: 0.107%