    NEGATIVE LOGITS
    Token               Logit
    “I                  -0.07
    (token missing)     -0.07
     inspectors         -0.06
    .She                -0.06
     içerisinde         -0.06
    Він                 -0.06
     LEVEL              -0.06
     Homeland           -0.06
     الذهاب             -0.06
    ..                  -0.06
    POSITIVE LOGITS
    Token               Logit
    66                  0.07
    dcc                 0.07
     cooking            0.07
    elastic             0.06
     خویش               0.06
    (token missing)     0.06
     barbar             0.06
    ulist               0.06
    lider               0.06
     :'                 0.06
    Activation Density: 0.059%

    No Known Activations