INDEX
Explanations
the preposition "on" in various contexts
Negative Logits
rom        -0.16
scale      -0.14
Spir       -0.14
يج         -0.14
clean      -0.14
bombing    -0.14
Pey        -0.14
onom       -0.14
com        -0.13
scale      -0.13
Positive Logits
ehen       0.16
дов        0.15
itten      0.15
onda       0.15
リーズ     0.15
aisy       0.15
orda       0.15
sts        0.14
опис       0.14
阪         0.14
Activations Density 0.009%