INDEX
Explanations
phrases indicating opposition or contradiction
negations or expressions of denial
Negative Logits
ngth      -0.73
æ©        -0.68
ĸļ        -0.66
lance     -0.65
kamp      -0.63
Line      -0.63
charges   -0.62
file      -0.62
plex      -0.62
quin      -0.61
Positive Logits
necessarily   0.88
outright      0.86
eworthy       0.84
epad          0.79
acles         0.76
exactly       0.74
icably        0.74
entirely      0.73
remotely      0.73
adequately    0.70
Activations Density: 0.035%