INDEX
    Explanations

    words related to war, conflict, and distortion

    Negative Logits
     republi     -0.70
     praktik     -0.66
     kandid      -0.64
     pól         -0.64
     gius        -0.60
     Kün         -0.60
     trö         -0.59
     biograf     -0.58
     granat      -0.58
     Städ        -0.57
    Positive Logits
     twist       0.85
     bend        0.82
     bending     0.80
     bends       0.79
     twisting    0.79
     twisted     0.79
     bent        0.78
     distorted   0.77
     twists      0.76
     distortion  0.74
    Act Density 0.143%

    No Known Activations