Explanations
phrases conveying disapproval or rebuttal
words related to disapproval and critique
Negative Logits
GOODMAN       -0.74
hemor         -0.71
Hungry        -0.68
Insect        -0.63
imagination   -0.63
imeters       -0.63
ivating       -0.63
Frie          -0.62
Alz           -0.62
intest        -0.62
Positive Logits
ledged        0.78
stellar       0.75
essors        0.73
Dear          0.73
ials          0.71
ylon          0.67
gate          0.67
zzle          0.66
thereof       0.66
ously         0.66
Activations Density: 0.140%