    Negative Logits
    Token         Logit
    "t"           1.04
    "d"           0.98
    "os"          0.98
    "it"          0.82
    "l"           0.82
    "is"          0.82
    "r"           0.81
    "m"           0.79
    "ological"    0.76
    "ac"          0.74
    Positive Logits
    Token         Logit
    " DVDs"       1.22
    "DVD"         0.96
    " DVD"        0.90
    " dvd"        0.77
    "шены"        0.75
    "MORDOR"      0.74
    " are"        0.71
    "рина"        0.70
    " VHS"        0.70
    (token text missing in source)  0.70
    Activation Density: 0.003%

    No Known Activations