INDEX
    Explanations

    phrases related to planning, agency, and social action

    Negative Logits
    "atron"       -0.17
    "енка"        -0.15
    "thro"        -0.15
    "ndx"         -0.14
    "acin"        -0.14
    "äch"         -0.14
    " libertin"   -0.14
    "etzt"        -0.14
    "873"         -0.14
    "awn"         -0.14
    Positive Logits
    " support"    0.16
    " save"       0.15
    " material"   0.15
    " prolong"    0.14
    " live"       0.14
    " lit"        0.14
    " jump"       0.14
    " the"        0.14
    " Jump"       0.14
    " Kob"        0.13
    Activation Density: 0.262%

    No Known Activations