INDEX
Explanations
phrases indicating strong positive or negative opinions
phrases indicating a positive assessment or recognition of quality
Negative Logits
pez          -0.66
hyde         -0.66
rely         -0.66
yahoo        -0.65
oleon        -0.63
ggle         -0.63
omore        -0.63
inity        -0.60
Samp         -0.60
iferation    -0.60
Positive Logits
suited        1.10
spring        1.08
vers          1.04
behaved       1.03
enough        0.99
enough        0.97
acquainted    0.96
positioned    0.95
documented    0.94
regarded      0.94
Activations Density: 0.035%