INDEX
    Explanations

    phrases related to self-identification and personal claims

    Negative Logits
    neau            -0.17
    OLA             -0.15
    owe             -0.15
    vou             -0.15
    olec            -0.15
     Cron           -0.15
    .LookAndFeel    -0.15
    je              -0.15
    ÑĢами           -0.14
     Vys            -0.14
    Positive Logits
    FRING           0.15
    836             0.15
    847             0.15
    èıĮ             0.14
    970             0.14
    ppard           0.14
    804             0.14
    reso            0.14
    zag             0.13
    chn             0.13
    Act Density 0.193%

    No Known Activations