    Explanations

    references to war and related terminology

    Negative Logits
    " bek"            -0.39
    " Encu"           -0.35
    " pomo"           -0.31
    "ظة"              -0.30
    " Stroh"          -0.30
    "ところで"        -0.29
    (token missing)   -0.29
    " Insert"         -0.29
    "verna"           -0.29
    " Bühne"          -0.28
    Positive Logits
    "lords"             0.69
    "lord"              0.65
    " MainAxisSize"     0.64
    "SpringRunner"      0.63
    "thog"              0.60
    " defaultstate"     0.59
    "farin"             0.58
    " crimes"           0.58
    "crimes"            0.57
    " autorytatywna"    0.57
    Activation Density: 0.171%

    No Known Activations