INDEX
    Explanations

    phrases that express comparisons or references to examples

    Negative Logits
    <bos>             -0.77
     préfé            -0.55
     debout           -0.53
     appartiennent    -0.52
     doulou           -0.51
    styleType         -0.50
     alimentaires     -0.50
     unggul           -0.50
     demais           -0.50
     étoient          -0.50
    Positive Logits
     propOrder        0.82
    tagext            0.71
    __*/              0.70
    avits             0.69
     gyhoeddwyd       0.69
    ]();              0.69
    mption            0.68
    AndroidJUnit      0.68
     المعيارى         0.68
    ="@+              0.66
    Activation Density: 0.009%

    No Known Activations