Explanations
references to food and cooking
Negative Logits
Token     Logit
Baker     -0.17
meisjes   -0.16
Cake      -0.15
lobal     -0.15
Cake      -0.15
cake      -0.15
cake      -0.15
Bakery    -0.15
ulpt      -0.15
bakery    -0.14
Positive Logits
Token     Logit
soup      0.56
Soup      0.51
soup      0.46
sou       0.45
broth     0.44
Soup      0.44
Sou       0.39
oup       0.34
_soup     0.33
Sou       0.32
Activations Density: 0.105%