    Explanations

    mention of conflicts or battles

    the term "wars" and its variations in context

    Negative Logits
    Token          Logit
    "gow"          -0.76
    "YL"           -0.66
    "Dialogue"     -0.66
    "opathy"       -0.65
    "obook"        -0.62
    "OGR"          -0.61
    "uration"      -0.61
    "STER"         -0.61
    "urated"       -0.61
    " Accuracy"    -0.60
    Positive Logits
    Token          Logit
    "hip"          1.26
    "hips"         1.14
    " waged"       0.95
    " raged"       0.83
    " wars"        0.83
    "pread"        0.83
    " raging"      0.83
    "pite"         0.82
    "pace"         0.82
    " fought"      0.79
    Activation Density: 0.044%

    No Known Activations