INDEX
Explanations
references to bread and its various forms or qualities
Negative Logits
λή         -0.17
phies      -0.16
ERGE       -0.15
æĸ¹        -0.15
æħĭ        -0.15
ortex      -0.15
ertools    -0.15
antics     -0.14
isay       -0.14
nels       -0.14
Positive Logits
sticks     0.33
basket     0.30
fruit      0.29
stick      0.28
winner     0.27
ths        0.24
pudding    0.23
win        0.23
crumbs     0.22
ruit       0.22
Activations Density 0.010%