Explanations
Adjectives and verbs describing negative emotions and actions
Themes of conflict and societal challenges
Negative Logits
illin       -0.72
Dispatch    -0.70
Entered     -0.68
Tes         -0.64
azaki       -0.64
Fla         -0.61
oga         -0.60
marrow      -0.60
Reviewed    -0.60
Vaughan     -0.57
Positive Logits
refers        0.83
huh           0.83
encompasses   0.81
Reviewer      0.80
NEVER         0.76
ertodd        0.76
nonetheless   0.73
itself        0.73
ALWAYS        0.73
entails       0.72
Activations Density: 1.058%