INDEX
    Explanations
    Negative Logits
    Token           Value
    nov             0.70
    associazione    0.64
    nič             0.62
    spectral        0.62
    ς               0.61
    los             0.61
    substantial     0.61
    вих             0.61
    vap             0.59
    laser           0.56
    Positive Logits
    Token           Value
     of             0.69
     ت              0.65
     ي              0.65
    (token not rendered)  0.63
     cili           0.59
     de             0.59
     t              0.57
     koko           0.57
     hearing        0.56
     Ó              0.55
    Activation Density: 0.001%

    No Known Activations