Explanations
references to various types of candy
references to candy and marshmallows
Negative Logits
isted     -0.87
ually     -0.74
inyl      -0.74
uren      -0.73
inity     -0.71
yright    -0.70
iet       -0.69
iance     -0.69
inel      -0.68
ests      -0.68
Positive Logits
mallow    1.23
flake     0.93
moon      0.86
crow      0.83
roo       0.82
MENTS     0.80
mand      0.79
FACE      0.77
dos       0.76
ゼウス    0.75
Activations Density: 0.094%