INDEX
    Explanations

    phrases indicating the allocation or prioritization of importance or value

    Negative Logits
    Token           Logit
    "zeug"          -0.16
    "opup"          -0.15
    "trag"          -0.15
    "/from"         -0.15
    " Avec"         -0.15
    "ecz"           -0.15
    "ape"           -0.14
    "illy"          -0.14
    "atsu"          -0.14
    " apply"        -0.14

    Positive Logits
    Token           Logit
    " emphasis"     0.30
    " bets"         0.29
    " importance"   0.23
    " blame"        0.23
    "emphasis"      0.23
    " Importance"   0.21
    " emphasize"    0.19
    " demands"      0.19
    " placed"       0.18
    " Limits"       0.18

    Activation Density: 0.044%

    No Known Activations