    Negative Logits
        Token               Logit
        " GenerationType"   -0.72
        " يتيمه"            -0.65
        "ItemBackground"    -0.63
        "retweeted"         -0.61
        " jsPsych"          -0.61
        "nodoc"             -0.60
        " informée"         -0.58
        "parsedMessage"     -0.58
        "titleMargin"       -0.58
        "protoimpl"         -0.57
    Positive Logits
        Token               Logit
        " Contrary"         0.65
        " contrary"         0.64
                            0.60
        "Contrary"          0.58
        " Other"            0.57
        " Throughout"       0.57
        "contr"             0.56
        " Filled"           0.56
        " Given"            0.55
        " Following"        0.53
    Activation Density: 0.016%

    No Known Activations