Explanations
mentions of the word "icing"
references to icing and baked goods
Negative Logits
ership   -0.94
ities    -0.79
rahim    -0.74
efer     -0.71
Dim      -0.70
ismo     -0.69
smith    -0.69
geon     -0.68
isha     -0.68
ggie     -0.66
Positive Logits
icing         1.23
cream         0.85
cream         0.83
urers         0.80
hemy          0.75
beet          0.71
umn           0.70
clot          0.70
hematically   0.69
straw         0.69
Activations Density 0.024%