    Explanations

    phrases related to effort and activity levels

    Negative Logits
    upro        -0.15
    byterian    -0.15
    enders      -0.14
     Kurul      -0.14
    raf         -0.14
    ufs         -0.14
    weets       -0.14
    iais        -0.14
    iversit     -0.13
    çĩ          -0.13
    Positive Logits
    ocha        0.17
    aday        0.17
    cket        0.17
    placer      0.16
    orex        0.15
    oney        0.15
    estre       0.14
    åĿĬ         0.14
    leta        0.14
    omba        0.14
    Activation Density: 0.084%

    No Known Activations