    Explanations

    pronouns indicating personal relationships and interactions

    Negative Logits
    kowski       -0.15
    iggins       -0.14
    _INCLUDED    -0.14
    iven         -0.14
    adb          -0.14
     breeze      -0.13
    strup        -0.13
    itur         -0.13
    ↵↵           -0.13
    èijī         -0.13
    Positive Logits
    lash         0.14
    ptions       0.14
    99           0.13
    olvency      0.13
    rypto        0.13
    éĢł          0.13
    enta         0.13
    alar         0.13
    anders       0.12
    akening      0.12
    Activation Density: 0.144%

    No Known Activations