INDEX
    Explanations

    first-person pronouns and expressions of personal reflection or opinion

    Negative Logits
    "agar"       -0.15
    "eld"        -0.15
    "igan"       -0.14
    "onds"       -0.14
    "acha"       -0.14
    "ilder"      -0.14
    "ff"         -0.14
    " ff"        -0.14
    "ÅĻÃŃ"       -0.13
    " Porter"    -0.13
    Positive Logits
    "utto"        0.17
    "èĩ"          0.15
    " kraj"       0.15
    "entarios"    0.14
    " Hass"       0.14
    " пиÑī"       0.14
    "ë¯"          0.14
    "Ëĺ"          0.14
    "ADC"         0.14
    "utow"        0.13
    Activation Density: 0.076%

    No Known Activations