INDEX
Explanations
words related to criticism and negative judgment
instances of the word "dumb" and its variations
Negative Logits
OHN       -0.81
AUT       -0.77
cially    -0.73
Lago      -0.71
APH       -0.70
ILA       -0.69
riott     -0.68
IUM       -0.67
ournal    -0.66
UAL       -0.66
Positive Logits
founded    1.15
arton      0.99
ness       0.98
dumb       0.95
stru       0.93
found      0.92
bell       0.92
wallet     0.91
Dumb       0.91
est        0.90
Activations Density: 0.004%