INDEX
    Explanations

    adverbs that indicate certainty or frequency

    Negative Logits
    " is"               -0.69
    " was"              -0.57
    " are"              -0.57
    " were"             -0.48
    " themſelves"       -0.39
    " will"             -0.35
    " هو"               -0.35
                        -0.34
    " هي"               -0.33
    "'"                 -0.32

    Positive Logits
    "AddTagHelper"      0.61
    " belonged"         0.59
    " seemed"           0.55
    "DeleteBehavior"    0.55
    " been"             0.55
    " emailAlready"     0.55
    "الحياه"            0.54
    " existed"          0.53
    "__*/"              0.52
    " Infórmanos"       0.52
    Activation Density: 0.372%

    No Known Activations