INDEX
    NEGATIVE LOGITS
    Token           Logit
    "-wave"         -0.07
    " principio"    -0.06
    " вред"         -0.06
    "4"             -0.06
    "Número"        -0.06
    " supreme"      -0.06
    " ADV"          -0.06
    "Highlight"     -0.06
    "A"             -0.06
    " scanner"      -0.06
    POSITIVE LOGITS
    Token                                 Logit
    " they"                               0.14
    " They"                               0.12
    "they"                                0.10
    "They"                                0.09
    " THEY"                               0.09
    " he"                                 0.08
    ".They"                               0.07
    "################################"    0.07
    " she"                                0.07
    " it"                                 0.07
    Activation Density: 0.135%

    No Known Activations