INDEX
    Explanations

    first-person pronouns and references to personal experiences

    Negative Logits
    " ev"          -0.15
    "ilim"         -0.14
    " McK"         -0.14
    "ÑĢана"        -0.13
    "OTE"          -0.13
    ".providers"   -0.13
    " stan"        -0.13
    "Jerry"        -0.13
    ".PO"          -0.13
    "outers"       -0.13
    Positive Logits
    " âĹĦ"         0.16
    "spo"          0.15
    ".baidu"       0.15
    " jclass"      0.14
    "ulle"         0.14
    "okol"         0.14
    ".nih"         0.13
    " jinak"       0.13
    "_REPLY"       0.13
    " málo"        0.13
    Act Density 0.053%

    No Known Activations