INDEX
    Explanations
    Negative Logits
     sizin          -0.07
    女性            -0.06
     franchise      -0.06
     bytes          -0.06
     erratic        -0.06
     مشکل           -0.06
     باید           -0.06
     extremism      -0.06
    ül              -0.06
     Cumhurbaşkanı  -0.06
    Positive Logits
    BOOK            0.07
     encourage      0.07
     scout          0.07
     facility       0.06
    (sub            0.06
    (util           0.06
     Pretty         0.06
     brains         0.06
    Perl            0.06
    <path           0.06
    Activation Density: 0.004%

    No Known Activations