INDEX
Explanations
phrases indicating contrast or disagreement
the phrase "No" followed by varying degrees of emphasis regarding certainty or agreement
Negative Logits
iership    -0.77
ials       -0.71
ゼウス     -0.70
RAFT       -0.66
lycer      -0.66
rex        -0.66
romy       -0.65
endish     -0.64
assies     -0.64
aven       -0.62
Positive Logits
xious       1.05
longer      0.97
matter      0.96
etheless    0.93
kidding     0.89
doubt       0.88
zzle        0.85
indication  0.84
clue        0.83
terday      0.81
Activations Density 0.063%