INDEX
    Explanations

    the word "or" with a strong activation value

    phrases indicating alternatives or choices

    Negative Logits
    " Pony"      -0.90
    "ocracy"     -0.76
    "ocrats"     -0.74
    "ocrat"      -0.72
    "achine"     -0.71
    "ulhu"       -0.70
    "ascus"      -0.70
    "ARDS"       -0.68
    " Rocket"    -0.66
    "efer"       -0.66
    Positive Logits
    "chard"         1.35
    "Else"          1.35
    "nam"           1.29
    "acles"         1.27
    "ifice"         1.23
    "acle"          1.23
    "nery"          1.21
    "chid"          1.14
    " otherwise"    1.07
    " else"         1.05
    Activation Density: 0.145%

    No Known Activations