Explanations
phrases related to expressing and discussing personal opinions and views
references to personal opinions or perspectives
Negative Logits
Token       Logit
Dull        -0.75
trap        -0.67
insert      -0.64
unbeliev    -0.64
Textures    -0.63
cia         -0.63
dry         -0.63
eri         -0.62
duct        -0.62
Sequ        -0.61
Positive Logits
Token       Logit
stances     1.18
beliefs     1.05
stance      0.97
expressed   0.96
opinions    0.95
regarding   0.91
views       0.91
odox        0.91
viewpoints  0.91
esp         0.86
Activations Density: 0.144%