    Negative Logits
        "_btn"                -0.07
        " Theory"             -0.07
        " Menschen"           -0.06
        " такого"             -0.06
        " Component"          -0.06
        (token text missing)  -0.06
        " charity"            -0.06
        " Biography"          -0.06
        "-bg"                 -0.06
        " flavours"           -0.06
    Positive Logits
        " ];"                 0.07
        " master"             0.07
        " dra"                0.07
        " Cron"               0.07
        " feder"              0.07
        ".drawer"             0.06
        "napshot"             0.06
        " Sgt"                0.06
        (token text missing)  0.06
        "ιστο"                0.06
    Act Density: 0.004%

    No Known Activations