INDEX
Explanations
terms related to resource extraction and production processes
Negative Logits
mür
-0.17
dete
-0.14
lý
-0.14
ẩu
-0.14
siz
-0.13
zakáz
-0.13
řev
-0.13
дап
-0.12
nederland
-0.12
원이
-0.12
Positive Logits
can
0.13
Âłs
0.13
Âłin
0.12
re
0.12
Âłp
0.12
dev
0.12
of
0.12
con
0.12
to
0.12
Âłd
0.12
Activations Density 0.051%
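
For orientation, the density figure can be read as the share of token positions on which the feature fires at all. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which activate the feature.
acts = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.050%
```

A density of 0.051% would then correspond to roughly 5 active tokens per 10,000, which marks a sparse feature.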