INDEX
    Explanations
    Negative Logits
    Token          Logit
    " amélior"     -0.08
    " migli"       -0.08
    " melhora"     -0.07
    "inactive"     -0.07
    "spy"          -0.07
    "834"          -0.07
    "shows"        -0.07
    " boz"         -0.07
    " randint"     -0.07
    "联网"         -0.07
    Positive Logits
    Token            Logit
    " genocide"      0.11
    " atrocities"    0.10
    " massacre"      0.10
    " massac"        0.10
    " orchestr"      0.09
    " Holocaust"     0.09
    " tragedy"       0.09
    " oppressed"     0.08
    " extermin"      0.08
    " persecution"   0.08
    Activation Density: 0.010%

    No Known Activations