INDEX
Explanations
phrases indicating personal opinion or judgment
expressions of personal opinion or perspective
Negative Logits
Uriel    -0.77
trump    -0.71
ress     -0.69
yip      -0.68
irds     -0.67
letes    -0.66
onto     -0.65
orthy    -0.65
oline    -0.65
lete     -0.65
Positive Logits
estimation    1.34
opinion       1.21
absence       1.10
case          1.08
sense         1.07
midst         1.05
view          1.04
dealings      0.98
haste         0.98
eyes          0.98
Activations Density 0.083%