    Explanations

    terms related to language and grammar classifications

    Negative Logits
    Token             Logit
    " translation"    -0.16
    "Translated"      -0.15
    " translating"    -0.15
    " translated"     -0.15
    " orient"         -0.15
    "translated"      -0.14
    " translate"      -0.14
    "translate"       -0.14
    "orient"          -0.14
    " actionTypes"    -0.14
    Positive Logits
    Token             Logit
    " spoken"         0.31
    "spoken"          0.29
    " speakers"       0.29
    " dialect"        0.25
    " Speakers"       0.24
    "dia"             0.24
    " speaker"        0.19
    " dia"            0.19
    " Standard"       0.19
    " varieties"      0.19
    Activation Density: 0.048%

    No Known Activations