    Explanations

    phrases indicating uncertainty or conditions related to actions and outcomes

    Negative Logits
    Token         Logit
    "Ỽ"           -0.15
    "iqu"         -0.13
    "olk"         -0.13
    " nors"       -0.13
    "istr"        -0.13
    "we"          -0.13
    "czas"        -0.13
    "ar"          -0.13
    "anca"        -0.13
    "ateurs"      -0.13
    Positive Logits
    Token         Logit
    " having"     0.35
    "having"      0.26
    " Having"     0.24
    "Having"      0.24
    " knowing"    0.20
    " being"      0.18
    " allowing"   0.18
    " ayant"      0.18
    " Presence"   0.17
    " doing"      0.17
    Activation Density: 0.530%

    No Known Activations