INDEX
    Explanations

    references to criticism and public backlash

    Negative Logits
    Token        Logit
    "alez"       -0.17
    "IFn"        -0.15
    "idth"       -0.15
    "bjerg"      -0.14
    " blat"      -0.14
    " ấn"        -0.13
    "oss"        -0.13
    "ysl"        -0.13
    "aze"        -0.13
    "tron"       -0.13
    Positive Logits
    Token        Logit
    " directed"  0.29
    " online"    0.25
    " hur"       0.22
    "Directed"   0.21
    " leveled"   0.21
    " aimed"     0.21
    " lev"       0.21
    " level"     0.20
    " voices"    0.20
    " Directed"  0.19
    Activation Density: 0.177%

    No Known Activations