    Negative Logits
     amici          0.62
     adventurers    0.57
     Miloš          0.56
     Víctor         0.55
    رم              0.53
     collectiv      0.52
    ند              0.52
     étrangers      0.52
     Vuk            0.52
     niv            0.52
    Positive Logits
    t               1.05
    an              1.01
    i               0.84
    ed              0.80
    n               0.76
    to              0.73
    _               0.70
    es              0.70
    :               0.68
    de              0.66
    Activation Density: 0.002%

    No Known Activations