INDEX
    Explanations

    adjectives related to physical characteristics

    Negative Logits
     vété         -0.79
     unspeak      -0.76
     fameux       -0.72
     congrès      -0.70
     miroir       -0.70
     tournant     -0.68
     appui        -0.68
     nuage        -0.67
     complément   -0.67
     levier       -0.66
    Positive Logits
     vogli         1.00
     ideolog       0.99
     utop          0.95
     succede       0.93
     solidar       0.89
    <bos>          0.87
     gymnas        0.79
     voleva        0.78
     patin         0.75
     atle          0.75
    Activation Density 0.269%

    No Known Activations