Negative Logits
Token             Logit
ãĥ¥               -0.81
TAIN              -0.79
################  -0.77
EED               -0.76
ãĤ¡               -0.72
riott             -0.71
llah              -0.71
ELY               -0.71
GGGGGGGG          -0.69
NING              -0.69

Positive Logits
Token             Logit
doll              0.97
maker             0.95
dolls             0.88
enger             0.84
houses            0.83
ies               0.81
house             0.81
ophone            0.81
oning             0.78
omb               0.74
Activations Density 0.023%