INDEX
Explanations
adjectives or verbs related to criticism or complacency
terms related to complaints or grievances
Negative Logits
hyde          -0.90
bucks         -0.76
eele          -0.75
EStream       -0.74
terday        -0.74
¥µ            -0.68
HAEL          -0.67
xon           -0.66
plane         -0.65
Downloadha    -0.65
Positive Logits
acent         1.22
ainer         1.01
icating       1.01
antly         0.94
icates        0.93
iable         0.85
iability      0.83
ause          0.83
icit          0.82
isance        0.82
Activations Density 0.006%