INDEX
    Explanations

    names related to individuals in a political context, particularly involving North Korea

    Negative Logits
    " replay"    -0.65
    " MOT"       -0.64
    " multipl"   -0.64
    " destro"    -0.64
    " Digest"    -0.63
    " macros"    -0.62
    " REAL"      -0.61
    " ditch"     -0.61
    " defin"     -0.61
    "ATTLE"      -0.61

    Positive Logits
    "sung"       1.16
    "jin"        1.09
    "Yang"       1.04
    "wei"        1.04
    "Hong"       1.03
    "kai"        0.99
    "Yu"         0.97
    "Yan"        0.97
    "hun"        0.97
    "Tai"        0.96

    Activation Density: 0.040%

    No Known Activations