INDEX
    Explanations

    the word "English" specifically

    references to the English language

    Negative Logits
    Token       Logit
    "Sensor"    -0.71
    "ront"      -0.68
    " mounts"   -0.68
    "orsi"      -0.66
    " incent"   -0.66
    "@#"        -0.66
    " [&"       -0.64
    " Romo"     -0.64
    "raz"       -0.63
    "kos"       -0.63
    Positive Logits
    Token           Logit
    " English"      3.63
    "English"       2.98
    " english"      2.84
    "english"       1.93
    " Spanish"      1.88
    " Arabic"       1.84
    " Portuguese"   1.77
    " Hindi"        1.74
    " French"       1.72
    " Welsh"        1.69
    Activation Density: 0.022%

    No Known Activations