INDEX
    Explanations

    instances of self-referential relationships or concepts

    Negative Logits
    :init          -0.16
    ixo            -0.16
    ابي            -0.15
    ابی            -0.14
    екар           -0.14
    NavController  -0.14
    esium          -0.14
    _READY         -0.14
    oine           -0.14
    _TAC           -0.13
    Positive Logits
     itself        0.52
     themselves    0.48
     himself       0.46
    自身           0.43
    自己           0.42
     herself       0.42
     Himself       0.38
     own           0.38
     ourselves     0.36
     oneself       0.35
    Act Density 0.231%

    No Known Activations