INDEX
    Explanations

    descriptive adjectives that characterize intensity or size

    Negative Logits
    Token            Logit
    "<bos>"          -2.33
    "/***\n"         -0.96
    "///**"          -0.90
    " defray"        -0.78
    " ratify"        -0.76
    " avrebbero"     -0.76
    " endow"         -0.74
    " intersper"     -0.73
    (blank)          -0.71
    "<?\n"           -0.70
    Positive Logits
    Token            Logit
    " asado"         0.80
    "hematical"      0.80
    " ados"          0.76
    " vinci"         0.74
    " maroc"         0.73
    "GRAPHS"         0.71
    " quoc"          0.71
    "lindo"          0.70
    " cuit"          0.70
    "mistak"         0.70
    Act Density 0.336%

    No Known Activations