INDEX
    Explanations
    NEGATIVE LOGITS
    av          1.02
    am          1.00
    ag          1.00
    ai          0.99
    il          0.86
    ah          0.85
    it          0.84
    ak          0.84
    im          0.82
    ed          0.81
    POSITIVE LOGITS
                0.83
                0.80
    ков         0.80
    ه           0.75
                0.75
    ImgBoard    0.71
                0.71
    ουν         0.70
    -           0.68
                0.68
    Act Density 0.000%

    No Known Activations