INDEX
    Explanations

    pronouns related to personal experience and identity

    Negative Logits
     pozw          -0.44
    ล้ว            -0.37
    ทอง            -0.37
    BIAS           -0.37
     gekomen       -0.37
     Zähne         -0.37
     kesulitan     -0.36
    niversitesi    -0.36
    richting       -0.35
     ilmo          -0.35
    Positive Logits
     initComponents    0.73
    Panamoan           0.59
     thiệu             0.54
    Skocz              0.53
     BaseModel         0.53
    ="@+               0.52
     فريبيس            0.50
    :✨                 0.50
     propOrder         0.49
    ніципалі           0.49
    Activation Density: 0.006%

    No Known Activations