INDEX
    Explanations

    words related to statements indicating an instruction or action

    instances of the word "all" and variations of capitalization

    Negative Logits
    hyde        -0.76
     Kamp       -0.76
    rir         -0.70
    ãĤ©         -0.69
     mathemat   -0.67
    uable       -0.64
    sein        -0.64
    rought      -0.63
     Gork       -0.63
    unin        -0.62
    Positive Logits
    igator      1.04
    ocations    0.94
    iances      0.92
    iance       0.92
    usions      0.90
    ergic       0.88
    ocation     0.88
    igators     0.86
    owed        0.85
     sorts      0.84
    Activation Density: 0.058%

    No Known Activations