INDEX
    Explanations
    NEGATIVE LOGITS
        Token               Value
        "z"                 1.13
        " l"                0.85
        "s"                 0.84
                            0.83
        "he"                0.80
        "j"                 0.79
        "y"                 0.77
        "'"                 0.77
        " "                 0.75
        "b"                 0.73
    POSITIVE LOGITS
        Token               Value
        "פו"                0.92
        "yatiti"            0.88
        " stargazerCount"   0.87
        " tathapi"          0.84
        " agacch"           0.83
        "<unused1823>"      0.82
        " ['(?"             0.82
        " overjoyed"        0.82
        " vibhav"           0.82
        "<unused1763>"      0.82
    Activation Density: 0.000%

    No Known Activations