    Explanations

    war and prisoners

    Negative Logits
    Token         Logit
     conception   -0.08
     justified    -0.08
     impr         -0.08
    -scale        -0.08
     correlated   -0.08
    hoog          -0.08
    ferencing     -0.08
     lighting     -0.08
    painting      -0.07
    igation       -0.07
    Positive Logits
    Token         Logit
     exchanges    0.09
     prisoner     0.09
     exchanging   0.09
     Exchanges    0.09
    Enumerator    0.09
     exchanged    0.09
    交換          0.09
     échanges     0.08
     prisoners    0.08
    scp           0.08
    Activation Density: 0.011%

    No Known Activations