INDEX
    Explanations

    references to family members and parental figures

    Negative Logits
     itself    -0.20
    ATUS       -0.15
    ROME       -0.14
    WARE       -0.14
    æľ¬        -0.13
    egot       -0.13
    ôme        -0.13
    asil       -0.13
    ç®±        -0.13
    ubi        -0.13
    Positive Logits
    大人         0.18
    -in        0.17
    /gr        0.17
    /legal     0.15
    ilit       0.15
    lessness   0.15
     remar     0.14
    ovny       0.14
    ondo       0.14
    dek        0.14
    Activation Density: 0.060%

    No Known Activations