INDEX
    Explanations

    phrases emphasizing self-reference or introspective concepts

    Negative Logits
    Token              Logit
    " nakalista"       -0.78
    "########."        -0.71
    " kürzlich"        -0.70
    " recentemente"    -0.68
    " récemment"       -0.68
    "enuta"            -0.67
    " daglig"          -0.66
    " itinéraires"     -0.65
    "zepine"           -0.65
    " lenker"          -0.64

    Positive Logits
    Token              Logit
    " itself"          2.29
    "itself"           2.09
    " Itself"          1.92
    " themselves"      1.21
    " sich"            1.09
    "themselves"       1.08
    "本身"             0.98
    " itſelf"          0.95
    " zich"            0.90
    " zichzelf"        0.81
    Act Density 0.103%

    No Known Activations