Explanations
concepts related to bilingualism and language proficiency
Negative Logits
translations    -0.19
Translations    -0.17
translations    -0.17
lations         -0.16
vocab           -0.16
translate       -0.15
translate       -0.15
Translate       -0.14
.openg          -0.14
upo             -0.14
Positive Logits
Second          0.23
ELF             0.23
second          0.22
Acquisition     0.22
Heritage        0.21
second          0.21
acquisition     0.21
Second          0.21
mother          0.19
target          0.19
Activations Density 0.042%