INDEX
Explanations
phrases indicating opinions or evaluations regarding achievements or qualities
Negative Logits
rypton            -0.16
OTHERWISE         -0.14
Aires             -0.14
ScrollIndicator   -0.14
rientation        -0.14
uctions           -0.14
indhoven          -0.14
Bid               -0.14
ellido            -0.14
ctors             -0.14
Positive Logits
critics           0.22
dư                0.22
controversy       0.21
media             0.21
widespread        0.20
debate            0.20
talk              0.20
buzz              0.20
everyone          0.20
headlines         0.19
Activations Density: 0.011%