INDEX
Explanations
languages and language-related terms
Negative Logits
Token         Value
èĭ±æĸĩ        -0.17
nger          -0.15
ellas         -0.15
aneous        -0.15
gings         -0.14
itionally     -0.14
ÅĻe           -0.14
norske        -0.14
äch           -0.14
.ss           -0.14
Positive Logits
Token         Value
-speaking     0.43
-language     0.39
language      0.30
-spe          0.28
-medium       0.27
language      0.25
speaking      0.25
spe           0.25
Language      0.21
anguage       0.20
Activations Density: 0.065%