INDEX
    Explanations
    Negative Logits
    Token                 Logit
    "r"                   1.21
    "a"                   1.19
    "e"                   1.07
    "atr"                 1.01
    "i"                   0.98
    "ו"                   0.97
    "ar"                  0.96
    (token not shown)     0.96
    "י"                   0.95
    "etr"                 0.94
    Positive Logits
    Token                 Logit
    " Ning"               0.88
    " dormir"             0.88
    "ules"                0.87
    "ningen"              0.85
    "uling"               0.85
    " Himal"              0.85
    " Helms"              0.82
    "Ning"                0.82
    (token not shown)     0.82
    " Tunes"              0.81
    Activation Density: 0.186%

    No Known Activations