INDEX
    Explanations

    code and programming comments

    Negative Logits
    Token     Logit
    "CH"      0.84
    "دون"     0.81
    "Е"       0.78
    " I"      0.77
    "ד"       0.73
    "드를"    0.72
    "دت"      0.72
    "م"       0.72
    "ding"    0.71
    "dB"      0.69
    Positive Logits
    Token     Logit
    "u"       1.11
    "ul"      0.92
    "и"       0.92
    "ير"      0.91
    "_"       0.91
    "is"      0.88
    " were"   0.87
    "ze"      0.87
    "on"      0.86
    "zed"     0.84
    Activation Density: 0.018%

    No Known Activations