INDEX
    Explanations

    mentions of ongoing actions or events

    Negative Logits
    Token         Logit
    "ernet"       -0.20
    "azor"        -0.17
    "stra"        -0.15
    "exus"        -0.15
    "aldi"        -0.14
    "acus"        -0.14
    "inte"        -0.14
    "ısından"     -0.14
    "urette"      -0.14
    "states"      -0.14
    Positive Logits
    Token         Logit
    " wrong"      0.27
    " happen"     0.24
    " happening"  0.23
    " bump"       0.23
    "-ons"        0.21
    " happened"   0.20
    " ons"        0.20
    "wrong"       0.19
    " down"       0.19
    " happens"    0.19
    Activation Density: 0.017%

    No Known Activations