INDEX
    Explanations

    phrases suggesting hypothetical situations with specific actions or consequences

    conditional statements and hypothetical scenarios

    Negative Logits
    "TION"          -0.66
    "ahead"         -0.61
    "Enough"        -0.59
    "Cook"          -0.58
    " Brill"        -0.58
    "Ready"         -0.58
    "sis"           -0.57
    " Ahead"        -0.57
    "Tok"           -0.57
    " contained"    -0.57
    Positive Logits
    " expect"       1.31
    " imagine"      1.29
    " suppose"      1.19
    " assume"       1.13
    " presume"      1.12
    " wonder"       1.03
    "agine"         0.99
    " speculate"    0.99
    " think"        0.97
    " argue"        0.96
    Activation Density: 0.072%

    No Known Activations