    Negative Logits
    Token                    Logit
    "Buy"                    -0.07
    (token not recovered)    -0.07
    "_inline"                -0.06
    " Pare"                  -0.06
    " Nice"                  -0.06
    " dire"                  -0.06
    "rello"                  -0.06
    "Parse"                  -0.06
    "ıntı"                   -0.06
    "urse"                   -0.06
    Positive Logits
    Token            Logit
    " freedom"       0.10
    " Freedom"       0.10
    "romise"         0.07
    " Activ"         0.07
    " Jacob"         0.07
    "Freedom"        0.07
    "rom"            0.07
    " liberty"       0.07
    "воб"            0.07
    " destroyer"     0.06
    Activation Density: 0.014%

    No Known Activations