INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "119"          -0.07
    " dolphins"    -0.07
    "icult"        -0.07
    " hardcore"    -0.07
    "reeNode"      -0.06
    " meta"        -0.06
    "指数"         -0.06
    "DEFINED"      -0.06
    "-core"        -0.06
    " хв"          -0.06
    POSITIVE LOGITS
    Token            Logit
    " political"     0.08
    " Politics"      0.07
    " Political"     0.07
    " Personally"    0.06
    "flake"          0.06
    " sefer"         0.06
    " politics"      0.06
    " politically"   0.06
    " elected"       0.06
    "АР"             0.06
    Activation Density: 0.020%

    No Known Activations