    Negative Logits
     Bücher
    0.73
     Lhasa
    0.70
     Australie
    0.66
     रीड
    0.65
     Exel
    0.63
    ιση
    0.62
     lectores
    0.61
     Playboy
    0.61
     Köz
    0.61
    ?</
    0.60
    Positive Logits
    ة
    0.61
    ;
    0.59
    devices
    0.54
    0.54
    dal
    0.53
    icillin
    0.53
     type
    0.52
    fruit
    0.52
    ية
    0.51
     ہمارا
    0.50
    Activation Density: 0.008%

    No Known Activations