INDEX
Explanations
phrases indicating agreement or alignment with a specific idea or plan
phrases indicating disagreement or division among groups
Negative Logits
emergencies    -0.69
harvested      -0.63
ess            -0.61
datas          -0.61
lookup         -0.61
Resources      -0.59
ãĥĺ            -0.59
stress         -0.59
dinners        -0.58
Detail         -0.58
Positive Logits
disagree       0.83
dismissing     0.83
disappro       0.83
recommending   0.82
essim          0.81
Rate           0.80
disagrees      0.80
approving      0.79
endorsing      0.79
agree          0.78
Activations Density 0.434%
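
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of sampled tokens whose activation exceeds the threshold.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.434% would mean the feature fires on roughly 4 of every 1000 sampled tokens.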