NEGATIVE LOGITS
includes      0.94
ensures       0.93
excludes      0.85
include       0.84
suggests      0.83
incluye       0.82
해당          0.81
considers     0.79
refiere       0.79
uses          0.79

POSITIVE LOGITS
monstros      1.10
craziness     1.09
stupid        1.07
stuff         0.96
thing         0.94
guy           0.93
crazy         0.92
confounded    0.91
stupidity     0.91
家伙          0.90
Activations Density: 0.075%