INDEX
    Explanations

    instances of attributed speech or reporting phrases

    Negative Logits
    Token        Logit
    " spin"      -0.18
    " spit"      -0.15
    " Spin"      -0.15
    " Gil"       -0.14
    " sch"       -0.14
    " pr"        -0.14
    " Mou"       -0.14
    "esign"      -0.14
    "ancial"     -0.13
    " mou"       -0.13
    Positive Logits
    Token        Logit
    "ách"        0.16
    "tems"       0.16
    "ymi"        0.15
    "/values"    0.14
    "zug"        0.14
    "ato"        0.14
    " dcc"       0.14
    "_abort"     0.14
    " writes"    0.14
    "atars"      0.13
    Activation Density: 0.037%

    No Known Activations