INDEX
Explanations
mentions of pancakes and syrup
references to pancakes and syrup
Negative Logits
Token       Logit
ment        -0.86
stra        -0.80
ments       -0.79
mint        -0.78
nuts        -0.77
Trader      -0.76
nut         -0.71
culosis     -0.70
chin        -0.69
ious        -0.69
Positive Logits
Token       Logit
syrup       0.93
pancakes    0.88
ĸļ          0.83
ipedia      0.76
aye         0.75
XT          0.75
panc        0.71
akeru       0.70
rase        0.70
bsite       0.67
Activations Density 0.040%