INDEX
    Explanations

    references to peace, reconciliation, and non-violence in various contexts

    Negative Logits
    "aille"     -0.17
    "λαν"       -0.16
    "alie"      -0.15
    " 몰"       -0.14
    "豪"        -0.14
    "ileged"    -0.14
    "uhn"       -0.14
    "RIX"       -0.14
    "ính"       -0.14
    "assy"      -0.14
    Positive Logits
    " peace"         0.76
    " Peace"         0.69
    "peace"          0.66
    "Peace"          0.65
    " peaceful"      0.55
    " pac"           0.48
    " peacefully"    0.44
    "pac"            0.35
    " Pac"           0.34
    " disarm"        0.32
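
The source does not say how the token/logit lists above were produced. One common approach, sketched here purely as an assumption, is to project a feature's decoder direction through the model's unembedding matrix and sort the resulting per-token logits. The names `top_logit_tokens`, `direction`, `unembed`, and `vocab` are hypothetical, and the numbers are toy values rather than data from this feature.

```python
import numpy as np

def top_logit_tokens(direction, unembed, vocab, k=3):
    """Return (most_negative, most_positive) lists of (token, logit) pairs.

    direction: (d_model,) feature direction (hypothetical input)
    unembed:   (d_model, vocab_size) unembedding matrix (hypothetical input)
    vocab:     list of token strings, one per vocabulary entry
    """
    logits = direction @ unembed      # one logit contribution per token
    order = np.argsort(logits)        # indices sorted by ascending logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy example: a 2-dimensional model with a 4-token vocabulary.
vocab = [" peace", " war", " disarm", "assy"]
unembed = np.array([[0.8, -0.1, 0.3, -0.2],
                    [0.0,  0.5, 0.1,  0.0]])
direction = np.array([1.0, 0.0])

negative, positive = top_logit_tokens(direction, unembed, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

Leading spaces in the token strings are kept because tokens such as " peace" and "peace" are distinct vocabulary entries.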
    Activation Density: 0.217%
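
The density figure above is not defined in the source. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature's activation is above zero, could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    acts = np.asarray(activations, dtype=float)
    return float((acts > threshold).mean())

# Toy sample: the feature fires on 1 of 8 tokens.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.217% would mean the feature is active on roughly 2 of every 1,000 tokens.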

    No Known Activations