INDEX
NEGATIVE LOGITS
Token        Logit
ĸļ           -0.61
²¾           -0.60
alid         -0.59
Delicious    -0.57
Tall         -0.57
NOT          -0.56
fed          -0.56
NEXT         -0.55
Daughter     -0.55
Ri           -0.55
POSITIVE LOGITS
Token        Logit
ones         0.82
hetti        0.79
urers        0.72
teness       0.69
oor          0.68
skeptics     0.66
ardy         0.66
oba          0.66
asio         0.65
urgy         0.65
Activations Density: 0.066%