INDEX
    Explanations

    phrases related to decision-making or organizational structure

    Negative Logits
    Token                 Logit
    "<bos>"               -2.91
    "/**"                 -0.77
    "<?"                  -0.69
    (whitespace token)    -0.69
    " harmonize"          -0.66
    " displace"           -0.65
    " disbur"             -0.64
    " coexist"            -0.63
    " expel"              -0.62
    " cooperated"         -0.60
    Positive Logits
    Token                 Logit
    " jawa"               1.08
    " magis"              1.00
    " venuto"             0.95
    " milano"             0.94
    " gamba"              0.89
    " affez"              0.89
    " napoli"             0.89
    " maroc"              0.88
    " paradiso"           0.88
    " roberto"            0.87
    Activation Density: 0.199%

    No Known Activations