INDEX
    Explanations

    personal pronouns and expressions of refusal or decision-making

    Negative Logits
    "achs"          -0.19
    " is"           -0.17
    "pires"         -0.17
    " ам"           -0.16
    "Am"            -0.15
    " Yesterday"    -0.14
    "-Am"           -0.14
    "Yesterday"     -0.14
    "feed"          -0.14
    " bere"         -0.14
    Positive Logits
    " think"        0.31
    " remember"     0.27
    "think"         0.26
    " guess"        0.24
    " THINK"        0.22
    " Think"        0.22
    "Think"         0.21
    " wish"         0.21
    " suppose"      0.21
    "'ll"           0.20
    Activation Density: 0.189%

    No Known Activations