    Negative Logits
    Logit  Token
    -0.07  (not shown)
    -0.06   polished
    -0.06   عشر
    -0.06   cols
    -0.06  (not shown)
    -0.06   Ich
    -0.06  есь
    -0.06  istance
    -0.06  (not shown)
    -0.06   duplication
    Positive Logits
    Logit  Token
    0.07    Athe
    0.06    overwhelmingly
    0.06   ('-',
    0.06   (open
    0.06   .tabPage
    0.06    zahrani
    0.06    γλώ
    0.06   (APP
    0.06   [:]↵
    0.06   "urls
    Activation Density: 0.037%

    No Known Activations