Explanations
words related to food preparation or cooking
Negative Logits
ika          -0.17
serrat       -0.17
kenin        -0.15
trick        -0.15
isons        -0.15
ĥn           -0.15
zers         -0.15
ãĥ¼ãĤ¹ãĥĪ    -0.14
ikan         -0.14
ijken        -0.14
Positive Logits
abble        0.33
appy         0.31
uffy         0.31
ubs          0.31
uples        0.29
apping       0.28
unch         0.28
umpt         0.28
anton        0.27
abb          0.25
Activations Density 0.008%
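
Two entries in the negative-logit list (ĥn and ãĥ¼ãĤ¹ãĥĪ) look like raw byte-level BPE token strings rather than readable text. The sketch below shows one way to map such strings back to text, assuming the tokens come from a GPT-2 style byte-level vocabulary; the source does not name the tokenizer, so that is an assumption. The helper names `byte_level_table` and `decode_token` are hypothetical and chosen only for illustration.

```python
def byte_level_table():
    """Build a GPT-2 style byte -> printable-character table (assumed scheme)."""
    # Bytes that are already printable keep their own code point.
    keep = (list(range(0x21, 0x7F))     # '!' .. '~'
            + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
            + list(range(0xAE, 0x100))) # U+00AE .. U+00FF
    byte_to_char = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to code points 256, 257, ... in order.
    shift = 0
    for b in range(256):
        if b not in byte_to_char:
            byte_to_char[b] = chr(256 + shift)
            shift += 1
    return byte_to_char


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in byte_level_table().items()}


def decode_token(token):
    """Turn a byte-level token string back into text.

    Tokens that hold only part of a multi-byte character cannot be decoded
    on their own, so undecodable bytes become U+FFFD instead of raising.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĤ¹ãĥĪ", "ĥn", "abble"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãĥ¼ãĤ¹ãĥĪ decodes to the katakana string ースト, while ĥn starts with a lone UTF-8 continuation byte and so has no standalone readable form. Plain ASCII tokens such as abble pass through unchanged.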