INDEX
Explanations
statements or descriptions with a strong or impactful tone
the term "bold" used to emphasize statements, policies, or actions
Negative Logits
Token       Logit
Cheong      -0.94
OTOS        -0.84
yip         -0.73
nesota      -0.67
enfranch    -0.67
DISTRICT    -0.67
apolis      -0.65
rogens      -0.64
uters       -0.63
aito        -0.62
Positive Logits
Token       Logit
faced       1.06
ness        1.03
er          0.98
face        0.97
bold        0.95
bold        0.93
word        0.84
est         0.82
Ital        0.79
nesses      0.79
Activations Density: 0.026%
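The density figure above can be read as the share of token positions on which the feature is active. A minimal sketch of that calculation follows, assuming "active" means an activation strictly above a threshold (zero by default). The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than `threshold`.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 token positions, of which 3 are active.
sample = [0.0] * 10_000
for position, value in ((12, 1.7), (4_000, 0.4), (9_999, 2.3)):
    sample[position] = value

density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.030%
```

Under the same assumption, a density of 0.026% would correspond to roughly 26 active positions per 100,000 tokens.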