    Explanations

    negative statements or expressions of doubt

    Negative Logits
    Token                   Logit
    " propOrder"            -0.76
    " autorytatywna"        -0.67
    " AssemblyCulture"      -0.65
    "EDEFAULT"              -0.64
    "ArrowToggle"           -0.61
    "хьтан"                 -0.57
    " ffilmiau"             -0.57
    "aarrggbb"              -0.56
    "UIControlState"        -0.56
    " Italijani"            -0.56
    Positive Logits
    Token                   Logit
    "[toxicity=0]"          0.82
    "Q"                     0.66
    " Q"                    0.49
    "<"                     0.48
    "</blockquote>"         0.47
    "["                     0.46
    "Hope"                  0.45
    "<strong>"              0.45
    "  "                    0.44
    "toxicity"              0.44
    Activation Density: 1.125%

    No Known Activations