INDEX
    Explanations

    mentions of forces, troops, and conflicts in different contexts

    Negative Logits
    "<bos>"       -1.71
    (blank)       -0.78
    "/***"        -0.74
    "<?"          -0.71
    "/**"         -0.69
    (blank)       -0.68
    "<?"          -0.68
    "public"      -0.67
    "//---"       -0.66
    " abolish"    -0.59
    Positive Logits
    " forces"     1.87
    " Forces"     1.74
    "Forces"      1.72
    "forces"      1.56
    " FORCES"     1.45
    " Minang"     1.43
    " force"      1.39
    " hcm"        1.29
    " nuoc"       1.26
    " Force"      1.22
    Act Density 0.104%

    No Known Activations