INDEX
    Explanations

    requests for explanation or clarification in discourse

    NEGATIVE LOGITS
    "]);
    
    -0.74
     DateFormat
    -0.66
     propOrder
    -0.65
    aarrggbb
    -0.62
     Où
    -0.61
    şört
    -0.61
     lengthen
    -0.60
    displayquote
    -0.60
     pylint
    -0.59
    قایناق‌لار
    -0.59
    POSITIVE LOGITS
     people
    0.86
     People
    0.84
    People
    0.77
     PEOPLE
    0.76
     ppl
    0.70
    people
    0.69
     ludzi
    0.68
     pessoas
    0.66
     somebody
    0.66
     Somebody
    0.65
    Act Density 0.314%

    No Known Activations