INDEX
    Explanations

    verbs related to actions or events

    Negative Logits
    Token           Logit
    "ibr"           -0.75
    "Tweet"         -0.70
    "akra"          -0.65
    "jar"           -0.64
    "je"            -0.64
    "deb"           -0.63
    "paper"         -0.63
    "umar"          -0.63
    "tal"           -0.63
    "ib"            -0.63

    Positive Logits
    Token           Logit
    "all"           1.47
    " all"          1.38
    " ALL"          1.38
    "All"           1.20
    " All"          1.18
    "ALL"           1.13
    " none"         1.06
    " everything"   1.05
    " entirety"     1.01
    " EVERY"        1.00
    Act Density 0.260%

    No Known Activations