    Explanations

    references to various degrees of emotional responses or sentiments

    Negative Logits
    Token                 Logit
    " Efq"                -0.91
    "ῖς"                  -0.83
    "ReusableCell"        -0.80
    "xodo"                -0.77
    " للمعارف"            -0.77
    "ecute"               -0.73
    "وأضاف"               -0.72
    "地说道"              -0.71
    " Gutenberg"          -0.70
    "TagMode"             -0.69

    Positive Logits
    Token                 Logit
    " it"                  0.70
    "detectChanges"        0.63
    " there"               0.61
    " I"                   0.58
    "<eos>"                0.55
    " we"                  0.55
    " they"                0.53
    " then"                0.53
    "AndEndTag"            0.51
    " CreateTagHelper"     0.50

    Activation Density: 0.432%

    No Known Activations