INDEX
    Explanations

    phrases indicating a choice or alternative

    phrases that indicate alternatives or choices

    Negative Logits
    Token              Logit
    "then"             -0.86
    "ocracy"           -0.78
    "NOW"              -0.76
    "ires"             -0.72
    "eth"              -0.69
    "erest"            -0.67
    "english"          -0.64
    "our"              -0.62
    "ETS"              -0.62
    "now"              -0.61
    Positive Logits
    Token              Logit
    "ifice"            1.44
    "chard"            1.43
    "nam"              1.43
    "acle"             1.42
    "acles"            1.37
    "chid"             1.35
    " alternatively"   1.25
    " otherwise"       1.24
    "GAN"              1.17
    "Else"             1.11
    Activation Density: 0.169%

    No Known Activations