    Explanations

    First-person pronoun

    Negative Logits
    Token            Logit
    "زار"            -0.08
    "ladatel"        -0.07
    "sumer"          -0.07
    "я"              -0.07
    " Soap"          -0.06
    "printw"         -0.06
    "StepThrough"    -0.06
    " реч"           -0.06
    "addAction"      -0.06
    "emachine"       -0.06
    Positive Logits
    Token            Logit
    " Half"          0.07
    " langs"         0.07
    "_tF"            0.06
    " forget"        0.06
    " "              0.06
    ">}↵"            0.06
    " MainMenu"      0.06
    "acs"            0.06
    "AIM"            0.06
    " collect"       0.06
    Activation Density: 0.164%

    No Known Activations