    Negative Logits
    Token          Logit
    "σ"            -0.08
    "-temp"        -0.07
    " Across"      -0.06
    "علوم"         -0.06
    " Winner"      -0.06
    (blank)        -0.06
    " As"          -0.06
    "*v"           -0.06
    (blank)        -0.06
    "pcf"          -0.06
    Positive Logits
    Token            Logit
    " comrades"      0.07
    " observing"     0.07
    " [<"            0.07
    " composition"   0.06
    " Ле"            0.06
    "pill"           0.06
    " RTC"           0.06
    (blank)          0.06
    " 입니다"        0.06
    (blank)          0.06
    Activation Density: 0.000%

    No Known Activations