INDEX
Explanations
phrases related to expressing personal opinions
phrases emphasizing differing perspectives or points of view
Negative Logits
apons        -0.81
fml          -0.72
Cosponsors   -0.66
venants      -0.65
reddits      -0.64
bryce        -0.64
UF           -0.61
âĨ           -0.61
20439        -0.61
é¾           -0.61
Positive Logits
contention        0.73
differentiation   0.71
gered             0.69
overlap           0.64
Distance          0.62
rupture           0.61
fame              0.61
inertia           0.60
accuser           0.59
tains             0.59
Activations Density: 0.068%