    Explanations

    mentions of recent events or developments

    Negative Logits
    " currently"      -0.26
    " recent"         -0.25
    " presently"      -0.23
    " eventual"       -0.23
    " later"          -0.22
    " eventually"     -0.21
    " latest"         -0.20
    "currently"       -0.19
    " recently"       -0.19
    "当前"            -0.19
    Positive Logits
    "/current"        0.28
    "ral"             0.23
    ";y"              0.22
    "emente"          0.22
    "elijk"           0.22
    "-ish"            0.19
    "LY"              0.19
    " addition"       0.18
    " grads"          0.18
    "ness"            0.18
    Activation Density 0.047%

    No Known Activations