INDEX
Negative Logits
rament            -0.74
################  -0.74
ãĥ¥               -0.74
TAIN              -0.71
GGGGGGGG          -0.71
ELY               -0.71
utherford         -0.69
RAW               -0.69
TRANS             -0.69
acular            -0.68
Positive Logits
maker             0.93
doll              0.90
dolls             0.89
hops              0.84
imates            0.82
ongs              0.77
oru               0.76
Doll              0.75
wright            0.75
ophone            0.73
Activations Density 1.108%