INDEX
    Explanations

    terms related to political power dynamics and capabilities

    Negative Logits
    <bos>
    -2.14
    
    
    -0.73
     enshr
    -0.72
     inaugurate
    -0.69
    /**
    -0.68
     harmonize
    -0.68
    -0.64
     abolish
    -0.64
     reunite
    -0.63
    <?
    -0.62
    Positive Logits
     paradiso
    1.27
     soggior
    1.10
     bandung
    1.08
     riva
    1.03
     megane
    1.02
     venuto
    0.96
     toscana
    0.95
    !!</
    0.95
     lele
    0.95
     eiffel
    0.95
    Act Density 0.292%

    No Known Activations