    Explanations

    pronouns and verbs indicating relational dynamics or agency in context

    Negative Logits
    "cona"      -0.18
    "antha"     -0.17
    " Acres"    -0.16
    "hos"       -0.15
    "åĪĩãĤĬ"    -0.15
    "ÏĨα"       -0.15
    "iyan"      -0.15
    "ाहर"       -0.15
    "witch"     -0.15
    " meis"     -0.15
    Positive Logits
    "oke"       0.17
    "Dyn"       0.15
    "undo"      0.15
    "ritten"    0.14
    "kin"       0.14
    "erable"    0.14
    "istol"     0.14
    "oc"        0.14
    "eb"        0.14
    "EB"        0.14
    Act Density 0.001%

    No Known Activations