INDEX
    Explanations

    words related to actions or activities, particularly those involving planning, decision-making, or structured processes

    Negative Logits
    "ijd"      -0.15
    "avan"     -0.14
    "vit"      -0.14
    "cape"     -0.14
    "ktop"     -0.14
    "ael"      -0.14
    "usan"     -0.14
    "διο"      -0.13
    "avern"    -0.13
    "aul"      -0.13
    Positive Logits
    " Kang"       0.18
    "etes"        0.15
    "ENTIAL"      0.15
    " Meadows"    0.15
    "jeta"        0.15
    "ungs"        0.15
    "UNG"         0.15
    "swire"       0.14
    "roys"        0.14
    "angelo"      0.14
    Activation Density: 0.014%

    No Known Activations