INDEX
    Explanations

    First/second person pronouns

    Negative Logits
    Token            Logit
    "Receive"        -0.07
    "Zen"            -0.06
    " Mech"          -0.06
    ".enterprise"    -0.06
    "Inspect"        -0.06
    " centro"        -0.06
    " самом"         -0.06
    " Happ"          -0.06
    ".centerX"       -0.06
    " apare"         -0.06
    Positive Logits
    Token            Logit
    "_Play"          0.06
    "-score"         0.06
    " *\"            0.06
    " outlet"        0.06
                     0.06
    " crafted"       0.06
    "bytes"          0.06
    " defenses"      0.06
    "ervation"       0.06
    "isers"          0.06
    Activation Density: 0.072%

    No Known Activations