Explanations
terms related to language and grammar classifications
Negative Logits
Token         Logit
translation   -0.16
Translated    -0.15
translating   -0.15
translated    -0.15
orient        -0.15
translated    -0.14
translate     -0.14
translate     -0.14
orient        -0.14
actionTypes   -0.14
Positive Logits
Token         Logit
spoken        0.31
spoken        0.29
speakers      0.29
dialect       0.25
Speakers      0.24
dia           0.24
speaker       0.19
dia           0.19
Standard      0.19
varieties     0.19
Activations Density: 0.048%
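
The density figure is easiest to read as a fraction of token positions. The sketch below is a minimal illustration, assuming density means the share of sampled positions where the feature's activation is above zero. That definition, the function name activation_density, and the sample data are assumptions made for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumed definition: a position counts as active when its activation
    is strictly greater than threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 12 active positions out of 25,000 tokens.
sample = [0.0] * 24988 + [1.5] * 12
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.048% means the feature fires on roughly 1 in every 2,000 token positions.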