INDEX
    Explanations
    Negative Logits
        " ابتد"        -0.07
        " Semester"    -0.07
        " зат"         -0.06
        " liber"       -0.06
        " Ku"          -0.06
        " kus"         -0.06
        "ी-"           -0.06
        ".min"         -0.06
        "-cert"        -0.06
        " Monica"      -0.06
    Positive Logits
        "Refer"        0.07
        "REAL"         0.07
        "COMM"         0.06
        (blank)        0.06
        "Sending"      0.06
        "ाइ"           0.06
        " tông"        0.06
        "Ρ"            0.06
        " مبانی"       0.06
        "TEM"          0.06
    Activation Density: 0.001%

    No Known Activations