    Explanations

    instances of significant numerical data or comparisons

    Negative Logits
    Token            Logit
    " change"        -0.91
    "change"         -0.80
    " switch"        -0.78
    " shift"         -0.78
    " changement"    -0.76
    "changed"        -0.74
    "Changed"        -0.74
    " CHANGE"        -0.72
    " Change"        -0.71
    " Shift"         -0.69
    Positive Logits
    Token            Logit
    "[toxicity=0]"   0.62
    "+#+#"           0.57
    "Xna"            0.57
    "findpost"       0.56
    " متعلقه"        0.55
    "FormTagHelper"  0.53
    "发表于"          0.52
    "tafogo"         0.51
    "saraba"         0.51
    " الاطلاع"       0.49
    Activation Density: 0.045%

    No Known Activations