INDEX
    Explanations

    expressions related to progress or updates

    the words "now" and "also", indicating a focus on current events or updates

    Negative Logits
    Token              Logit
    "Subject"          -0.70
    " omn"             -0.60
    "Rap"              -0.58
    " Mats"            -0.58
    "avier"            -0.56
    "Behind"           -0.55
    "harm"             -0.55
    "etting"           -0.54
    "onto"             -0.53
    " disobedience"    -0.53
    Positive Logits
    Token              Logit
    "been"             1.55
    " been"            1.32
    " undergone"       1.09
    " gone"            0.97
    " become"          0.96
    " gotten"          0.96
    " Been"            0.95
    "gone"             0.93
    " fallen"          0.92
    " begun"           0.90
    Act Density 0.151%

    No Known Activations