    Explanations

    references to personal identity and feelings of connection or isolation

    Negative Logits
    ancer
    -0.16
    anta
    -0.16
     Impossible
    -0.15
     impossible
    -0.14
    .configure
    -0.13
     lên
    -0.13
    n
    -0.13
    arken
    -0.13
    öz
    -0.13
    Impossible
    -0.13
    Positive Logits
     inside
    0.20
     somewhere
    0.20
     within
    0.18
     داخل
    0.17
     near
    0.17
     ợ
    0.17
     Inside
    0.17
    inside
    0.16
    _inside
    0.16
    Inside
    0.16
    Act Density 0.130%

    No Known Activations