Explanations
phrases or words related to something notorious or infamous
terms related to individuals or entities described as notorious or infamous
Negative Logits
ynthesis    -0.72
PT          -0.72
avers       -0.71
vet         -0.70
adra        -0.70
strings     -0.69
claimer     -0.69
dh          -0.68
cos         -0.67
joice       -0.67
Positive Logits
infamous    0.81
notorious   0.79
metic       0.78
culprit     0.78
offender    0.75
ebin        0.72
dirty       0.70
scourge     0.68
dictator    0.67
misuse      0.66
Activations Density 0.017%