INDEX
    Explanations

    assertions and statements about the existence or state of objects and events

    Negative Logits
    Token          Logit
    "acr"          -0.15
    "_SIG"         -0.14
    "ÃŃst"         -0.13
    "trusted"      -0.13
    " bunları"     -0.13
    " beden"       -0.13
    "óz"           -0.12
    "icular"       -0.12
    "argin"        -0.12
    "essaging"     -0.12
    Positive Logits
    Token          Logit
    " ours"        0.22
    " mine"        0.21
    "mine"         0.20
    " what"        0.19
    " from"        0.19
    " done"        0.19
    " happening"   0.19
    " theirs"      0.19
    " on"          0.19
    " why"         0.18
    Activation Density: 1.296%

    No Known Activations