    Explanations

    mentions of languages and translations

    Negative Logits
     Ireland      -0.18
    anford        -0.18
     australia    -0.15
     Schneider    -0.15
    ì¼            -0.15
     Ire          -0.15
     Hell         -0.14
    ظ             -0.14
    缣            -0.14
    Compile       -0.14
    Positive Logits
     Hebrew        0.35
     Spanish       0.35
     Arabic        0.34
     Portuguese    0.33
     Mandarin      0.32
     French        0.31
     Russian       0.29
    Spanish        0.29
     Hindi         0.29
     Gujar         0.28
    Act Density 0.193%

    No Known Activations