INDEX
Explanations
negative language related to criticism
negative evaluations of situations or concepts
Negative Logits
)."        -1.09
.")        -0.86
]."        -0.81
.""        -0.76
").        -0.71
.'"        -0.69
)</        -0.69
'."        -0.63
netflix    -0.62
.).        -0.61
Positive Logits
ometimes   0.50
ggles      0.46
issance    0.43
Picture    0.41
':         0.40
GEAR       0.40
aples      0.40
arching    0.40
iquette    0.39
owship     0.39
Activations Density 2.490%