INDEX
    Explanations

    references to personal identity and familial relationships

    Negative Logits
    Token           Logit
    " naturally"    -0.15
    "ombo"          -0.15
    " straight"     -0.15
    " bình"         -0.14
    " Syn"          -0.14
    "anta"          -0.14
    "edla"          -0.14
    " Virgin"       -0.14
    " synchron"     -0.14
    "ruk"           -0.14

    Positive Logits
    Token           Logit
    "DDR"           0.16
    "ROS"           0.15
    "dock"          0.14
    "جاد"           0.14
    "jas"           0.14
    "çģ"            0.14
    " Inflate"      0.14
    "igos"          0.14
    " اÙĨتظ"        0.14
    "_BB"           0.14
    Act Density 0.000%

    No Known Activations