INDEX
Explanations
words or phrases indicating doubt or challenge
instances of the word "question" and its variations
Negative Logits
rites        -0.86
oiler        -0.74
osponsors    -0.70
\\\\         -0.68
emetery      -0.67
ategory      -0.67
alions       -0.66
teasp        -0.66
ëĭ           -0.66
Torrent      -0.66
Positive Logits
whether      1.08
naires       1.06
why          0.98
motives      0.94
whether      0.84
ibly         0.83
ably         0.80
ingly        0.77
naire        0.77
how          0.77
Activations Density: 0.067%