INDEX
    Explanations

    conditional phrases that suggest hesitance or hypothetical scenarios

    Negative Logits
    " him"        -0.14
    " lui"        -0.14
    " eux"        -0.14
    "apon"        -0.14
    " herself"    -0.14
    " Probably"   -0.14
    "ovky"        -0.14
    "them"        -0.14
    " tieten"     -0.14
    "annies"      -0.13

    Positive Logits
    " they"        0.35
    "rames"        0.33
    " there"       0.33
    " indeed"      0.32
    " anything"    0.30
    "fy"           0.30
    " you"         0.29
    " it"          0.29
    " we"          0.29
    " nothing"     0.28

    Activation Density: 0.197%

    No Known Activations