INDEX
    Explanations

    A mix of personal pronouns and verb forms, particularly in expressions of emotion and self-reflection.

    Negative Logits
    wash          -0.14
    367           -0.14
    _aspect       -0.14
     Gilbert      -0.14
    inka          -0.13
     indis        -0.13
    yna           -0.13
     çĤ           -0.13
    antium        -0.13
    undo          -0.13
    Positive Logits
    SCALL         0.15
     Sexo         0.14
    setQuery      0.14
     Kostenlose   0.14
    @Id           0.14
    inan          0.14
     åľ           0.13
    ัà¸ĩà¸ģ        0.13
    åľ°           0.13
    æķ            0.13
    Activation Density: 0.002%

    No Known Activations