INDEX
    Explanations
    Negative Logits
    ovi         -0.08
    ्काल        -0.08
     unver      -0.07
    Went        -0.07
     Bolton     -0.07
     buss       -0.07
    (Video      -0.07
    anthem      -0.07
    .")↵        -0.07
    abta        -0.07
    Positive Logits
     запр       0.08
     fractions  0.07
     Moore      0.07
     HAS        0.07
     picky      0.07
    (blank)     0.07
     krom       0.07
    (blank)     0.07
    .active     0.07
     geç        0.07
    Activation Density: 0.091%

    No Known Activations