INDEX
    Explanations

    multiple languages

    Negative Logits
    _CLASSES      -0.08
    BUT           -0.07
     Buzz         -0.07
     Diseases     -0.07
     bal          -0.07
     przede       -0.07
    Paint         -0.07
     SPR          -0.07
    _REGISTER     -0.07
    seo           -0.07
    Positive Logits
     cun          0.09
    धर            0.08
     واحد         0.08
     franca       0.08
     tích         0.08
    新版            0.08
    زن            0.08
     verz         0.08
     phần         0.08
    0.07
    Act Density 0.016%

    No Known Activations