INDEX
Explanations
evaluative or opinion-based phrases about people or things
words related to evaluations and judgments, such as praise, criticism, and descriptions
Negative Logits
tein          -0.63
Conversation  -0.59
rouse         -0.59
Same          -0.57
abiding       -0.57
Exit          -0.57
Wave          -0.56
acion         -0.56
mad           -0.56
reckoning     -0.55
Positive Logits
by               0.90
unfairly         0.89
extensively      0.84
unanimously      0.79
skept            0.77
igated           0.76
controvers       0.76
repeatedly       0.75
harshly          0.73
internationally  0.73
Activations Density: 0.180%