INDEX
    Explanations

    temporal references and indications of time-related events

    Negative Logits
     him
    -0.17
    him
    -0.16
     lui
    -0.16
    ÙĴÙĩ
    -0.16
    ersh
    -0.15
     whats
    -0.15
    icamente
    -0.14
    /il
    -0.14
    ysqli
    -0.14
     THEM
    -0.14
    Positive Logits
     they
    0.30
     that
    0.25
     we
    0.20
    that
    0.19
     mà
    0.18
     she
    0.18
     it
    0.17
     he
    0.17
     everything
    0.17
     THEY
    0.16
    Act Density 0.063%

    No Known Activations