INDEX
    Explanations

    references to cultural standards and perceptions of beauty

    Negative Logits
    Token             Logit
    "<bos>"           -2.18
    " intersper"      -0.61
    " enshr"          -0.57
    " amass"          -0.54
    "//---"           -0.53
    "/***\n"          -0.53
    " condense"       -0.52
    " tentatively"    -0.52
    " defray"         -0.51
    " harmonize"      -0.51
    Positive Logits
    Token             Logit
    " anymore"        0.99
    " signora"        0.92
    " bandung"        0.91
    " quoique"        0.82
    " nor"            0.81
    "postolic"        0.81
    " tristes"        0.80
    " warung"         0.79
    " jawa"           0.78
    "quarelle"        0.77
    Activation Density: 1.232%

    No Known Activations