    Explanations

    references to military incidents and their potential consequences

    Negative Logits
    Token          Logit
    "τέ"           -0.16
    "ute"          -0.15
    "roy"          -0.15
    " Sugar"       -0.15
    "661"          -0.14
    "eded"         -0.14
    "ROY"          -0.14
    "inals"        -0.14
    "utan"         -0.14
    " mole"        -0.14
    Positive Logits
    Token          Logit
    " peace"       0.26
    "peace"        0.24
    "Peace"        0.23
    " peaceful"    0.22
    " Peace"       0.22
    " war"         0.18
    "-war"         0.17
    " peacefully"  0.17
    "核"           0.15
    "往"           0.15
    Act Density 0.190%

    No Known Activations