INDEX
    Explanations

    mentions of Twitter handles and social media references

    Negative Logits
    Token               Logit
    "DeleteBehavior"    -0.86
    "انيف"              -0.75
    "Obrázky"           -0.74
    "<()>"              -0.73
    "migrationBuilder"  -0.72
    " חיצוניים"         -0.71
    "^(@)"              -0.69
    "writeFieldEnd"     -0.68
    "endpush"           -0.66
    " kasarigan"        -0.64
    Positive Logits
    Token               Logit
    " @"                0.81
    "@"                 0.56
    " (@"               0.52
    "enumi"             0.52
    " paraly"           0.48
    "󠁢"                 0.48
    ",@"                0.47
    " @__"              0.47
    " @_"               0.46
    "PAID"              0.45
    Activation Density: 0.205%

    No Known Activations