    Explanations

    references to events and changes over time

    Negative Logits
     eux          -0.16
     него         -0.14
    ť             -0.14
    him           -0.14
     lui          -0.14
    наче          -0.14
     Yourself     -0.14
    /***/         -0.14
     herself      -0.14
     них          -0.13
    Positive Logits
     things       0.36
     nothing      0.33
     many         0.31
     everything   0.31
     certain      0.29
     there        0.29
     additional   0.28
     none         0.27
     anything     0.26
     lots         0.26
    Activation Density 0.265%

    No Known Activations