INDEX
Explanations
phrases related to strong and emphatic statements, often of disapproval
instances of the word "outright" and its context related to strong assertions or definitive statements
Negative Logits
agine      -0.89
arts       -0.80
ulton      -0.77
ĺħ         -0.76
nan        -0.74
ĸļ         -0.72
anwhile    -0.72
gerald     -0.70
utes       -0.69
wisely     -0.69
Positive Logits
hostility      0.84
racism         0.77
contradiction  0.72
refusal        0.72
malice         0.71
disregard      0.70
itarian        0.69
theft          0.69
ban            0.68
guiActiveUn    0.67
Activations Density 0.026%
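
For readers unfamiliar with the density figure, the sketch below shows one plausible way such a number could be computed: the share of token positions at which the feature fires at all. This is an assumption about what "density" means here, not the dashboard's documented method; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where
    the feature is active. The source does not define the term.
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return 100.0 * active / len(values)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.026% would correspond to the feature firing on roughly 26 of every 100,000 token positions.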