INDEX
    Explanations

    personal pronouns and verbs related to decision-making or choice

    Negative Logits
    "ัà¸ķà¸ĸ"    -0.15
    " almost"  -0.14
    " Herm"    -0.14
    "èĭ¥"      -0.14
    "arton"    -0.13
    "arges"    -0.13
    "witter"   -0.13
    " Tup"     -0.13
    "elerik"   -0.13
    "ÑĢой"     -0.13
    Positive Logits
    " ever"    0.19
    "ever"     0.18
    "977"      0.16
    "843"      0.16
    "ças"      0.16
    "×Ļ"       0.15
    "997"      0.15
    " EVER"    0.15
    " denn"    0.14
    "577"      0.14
    Activation Density: 0.126%

    No Known Activations