INDEX
    Explanations

    instances of self-reference and personal opinion expressions

    Negative Logits
    Token             Logit
    "breadcrumbs"     -0.15
    " Understand"     -0.15
    "ovich"           -0.14
    "šak"             -0.14
    "гл"              -0.14
    "sounds"          -0.14
    " пад"            -0.14
    "hall"            -0.14
    "arat"            -0.13
    "igest"           -0.13
    Positive Logits
    Token             Logit
    " comparison"     0.18
    " recalled"       0.18
    " argument"       0.18
    " nhỼ"            0.17
    " remembered"     0.16
    "comparison"      0.16
    " Comparison"     0.16
    "onder"           0.16
    " reminder"       0.16
    "Comparison"      0.16
    Activation Density: 0.029%

    No Known Activations