INDEX
    Explanations

    actions and verbs related to approval and desire

    Negative Logits
    iÄįka        -0.15
    oons         -0.15
    ép           -0.15
    ãĥ³ãĥij       -0.14
    cio          -0.14
    awks         -0.14
    achuset      -0.14
     NORMAL      -0.13
    çĭIJ          -0.13
    -leg         -0.13

    Positive Logits
    ande         0.15
    ierce        0.14
    boa          0.14
    ONGL         0.14
    __$          0.14
    esse         0.14
    dr           0.14
    ız           0.14
    ÑĮÑı         0.13
    cape         0.13

    Activation Density: 0.040%

    No Known Activations