    Negative Logits
    "an"        1.19
    "f"         1.17
    "{"         1.09
    "2"         1.01
    "of"        0.92
    "gebaut"    0.89
    "5"         0.89
    "се"        0.88
    "_{"        0.87
    "<0x80>"    0.86
    Positive Logits
    "م"         1.59
    "т"         1.41
    "м"         1.38
    " on"       1.30
    "ل"         1.30
    "مین"       1.24
    "י"         1.23
    "ت"         1.23
    " sticks"   1.13
    "ي"         1.09
    Act Density 0.018%

    No Known Activations