INDEX
    Explanations

    mentions of languages and geographic locations

    Negative Logits
    èĦ
    -0.15
    ogn
    -0.14
    ess
    -0.14
    pyx
    -0.13
    å¼ķãģį
    -0.13
     ins
    -0.13
    ãĥģãĥ¥
    -0.13
    _palette
    -0.13
    udd
    -0.13
     Cup
    -0.13
    Positive Logits
    hoo
    0.19
    WithMany
    0.15
    ì¸ł
    0.15
    deaux
    0.15
    Editable
    0.14
    Writable
    0.14
    誤
    0.14
    ahl
    0.14
     اÙĦتÙĪ
    0.14
     Feinstein
    0.14
    Activation Density: 0.175%

    No Known Activations