INDEX
    Explanations

    general references to the concept of "all" or totality

    Negative Logits
    Token         Logit
    "thren"       -0.67
    "rients"      -0.67
    "ritic"       -0.65
    "reat"        -0.63
    "chieve"      -0.63
    "elve"        -0.62
    "atro"        -0.61
    "pperc"       -0.61
    "eele"        -0.61
    "atches"      -0.60

    Positive Logits
    Token         Logit
    "uding"        1.04
    "usion"        0.99
    "uring"        0.96
    "ocating"      0.96
    "ocated"       0.82
    " except"      0.79
    "usions"       0.78
    "ayed"         0.77
    " encomp"      0.76
    "udes"         0.75
    Activation Density: 0.019%

    No Known Activations