INDEX
Explanations
phrases expressing certainty or conviction
references to the speaker and their perspective
Negative Logits
urate   -0.69
licted  -0.67
pload   -0.65
abama   -0.65
eport   -0.64
shore   -0.63
cific   -0.61
selage  -0.60
pole    -0.60
rien    -0.60
Positive Logits
SQL        0.66
¥µ         0.66
wisely     0.62
damned     0.62
AGA        0.61
senal      0.58
caveats    0.56
selves     0.56
Audio      0.56
Exception  0.56
Activations Density: 0.107%