INDEX
    Explanations

    phrases related to a direction or reduction, such as the word "down" at various levels of intensity

    Negative Logits
    "archives"    -0.83
    "achu"        -0.70
    "itia"        -0.66
    "icles"       -0.64
    "andan"       -0.64
    "itive"       -0.63
    "rament"      -0.63
    "¶ħ"          -0.62
    "ificent"     -0.61
    "andise"      -0.61
    Positive Logits
    "LOAD"        1.20
    "graded"      1.14
    "stairs"      1.01
    "grading"     0.95
    "hill"        0.90
    "pour"        0.82
    "loaded"      0.81
    " stairs"     0.81
    "played"      0.79
    "grades"      0.78
    Activation Density: 2.275%

    No Known Activations