INDEX
    Explanations

    comments and engagement indicators in texts

    Negative Logits
    Token                 Logit
    "LookAnd"             -0.87
    " tartalomajánló"     -0.85
    " MainAxisSize"       -0.79
    " pleaſure"           -0.75
    " itſelf"             -0.73
    " greateſt"           -0.72
    " RSSSF"              -0.70
    " neceff"             -0.70
    " Efq"                -0.69
    " Jefus"              -0.69
    Positive Logits
    Token                 Logit
    "مصادر"               0.57
    "aarrggbb"            0.52
    "<eos>"               0.49
    (blank)               0.48
    (blank)               0.48
    "imod"                0.47
    (blank)               0.46
    "↵↵"                  0.45
    "PROTOBUF"            0.45
    "</h2>"               0.44
    Activation Density: 0.522%

    No Known Activations