INDEX
    Explanations

    detect, process, measure, disable

    Negative Logits
    "n"     0.71
    "use"   0.68
    "o"     0.63
    "r"     0.63
    "il"    0.62
    "u"     0.61
    "g"     0.60
    "pe"    0.60
    "ir"    0.59
    "w"     0.59
    Positive Logits
    " diri"        0.70
    " araştırm"    0.66
    "فيد"    0.59
    " lira"        0.58
    " zar"         0.57
    " contenido"   0.57
    " dipende"     0.56
    " וה"    0.55
    " strumento"   0.55
    " bahkan"      0.55
    Act Density 0.000%

    No Known Activations