    Explanations

    terms related to fairness or equity

    Negative Logits
    <bos>          -3.00
    (blank token)  -0.93
    (blank token)  -0.85
    /*++           -0.69
    /**            -0.69
    lateinit       -0.68
    <?             -0.68
    /*             -0.66
     interact      -0.64
     develop       -0.63
    Positive Logits
     bandung        1.51
     maroc          1.48
     Minang         1.47
     casio          1.36
     quoc           1.35
     cæ             1.35
     nuoc           1.34
     napoli         1.34
     brava          1.34
     tramont        1.34
    Activation Density: 0.078%

    No Known Activations