    Explanations

    pronouns and references to personal perspectives in discussions

    Negative Logits
    Token        Logit
    "apid"       -0.15
    "abler"      -0.15
    "OVE"        -0.15
    "/repos"     -0.15
    "erti"       -0.14
    "adir"       -0.14
    "άκ"         -0.14
    "اخ"         -0.14
    "Å¡tÄĽ"     -0.14
    "apus"       -0.13

    Positive Logits
    Token        Logit
    " think"     0.64
    " thinks"    0.55
    " Think"     0.52
    "think"      0.52
    "Think"      0.49
    " feel"      0.43
    " THINK"     0.42
    " believe"   0.41
    " feels"     0.40
    "认为"       0.38
    Activation Density: 0.297%

    No Known Activations