INDEX
Explanations
phrases mentioning contradictory information or beliefs
the word "contrary" and its variations, indicating a focus on presenting opposing viewpoints or rebutting popular beliefs
Negative Logits
aquin            -0.78
urated           -0.78
ahime            -0.77
ilitating        -0.75
arnaev           -0.71
beans            -0.70
afer             -0.70
artney           -0.69
hens             -0.69
istan            -0.69
Positive Logits
etheless         0.91
minded           0.85
lihood           0.79
ly               0.74
notwithstanding  0.72
chronological    0.72
contrary         0.71
lly              0.71
guiActiveUn      0.71
ptions           0.70
Activations Density 0.012%