INDEX
    Explanations

    fully followed by state

    Negative Logits
    "l"     1.45
    "t"     1.24
    "x"     1.20
    "n"     1.16
    "v"     1.16
    "Q"     1.16
    "p"     1.16
    "el"    1.06
    "Z"     1.03
    "z"     1.02
    Positive Logits
    " fully"         1.38
    "ate"            1.26
    " полностью"     1.22
    " Fully"         1.11
    "ación"          1.05
    " повністю"      1.02
    "fully"          0.97
    "ín"             0.96
    " sepenuhnya"    0.95
    "ové"            0.92
    Act Density 0.011%

    No Known Activations