Explanations
content related to pasta
Negative Logits
erer     -0.17
£        -0.16
ivery    -0.16
erken    -0.14
anc      -0.14
nze      -0.14
мест     -0.14
οκ       -0.14
iverz    -0.14
erif     -0.14
Positive Logits
illes    0.28
oral     0.25
ries     0.24
eur      0.24
afari    0.24
ille     0.23
ebin     0.22
ır       0.21
tense    0.21
el       0.20
Activations Density: 0.008%