INDEX
    Explanations
    Negative Logits
     soldier         -0.06
     работы          -0.06
    .ACT             -0.06
     soldiers        -0.06
    ORIZATION        -0.06
                     -0.06
    bero             -0.06
     veterinary      -0.06
    Helper           -0.06
     nuanced         -0.06
    Positive Logits
    ئت               0.07
     Vogue           0.07
     Electron        0.06
    imon             0.06
     Gala            0.06
     MR              0.06
                     0.06
     continues       0.06
     Vinyl           0.06
    -loop            0.06
    Activation Density: 0.020%

    No Known Activations