INDEX
    Explanations

    references to personal pronouns and expressions of connection or interaction between individuals

    Negative Logits
    anto              -0.16
    ะà¹ģ               -0.15
    sti               -0.15
    amp               -0.14
    ForKey            -0.14
    stoff             -0.14
    antor             -0.14
    scar              -0.13
    abay              -0.13
    ánt               -0.13

    Positive Logits
    ÑĤеÑĢн             0.15
    lac               0.15
    336               0.15
    366               0.14
    386               0.14
    328               0.14
    gle               0.14
    Fmt               0.14
    .scalablytyped    0.14
    365               0.14
    Activation Density: 0.137%

    No Known Activations