INDEX
    Explanations

    best, most extreme

    Negative Logits
    Token             Logit
    " magnitude"      -0.10
    "远远"            -0.08
    (token missing)   -0.08
    (token missing)   -0.08
    (token missing)   -0.07
    (token missing)   -0.07
    " odd"            -0.07
    "陌生"            -0.07
    " سورية"          -0.07
    " Rarity"         -0.06
    Positive Logits
    Token             Logit
    "Student"         0.07
    " mediaPlayer"    0.07
    "說話"            0.07
    "UsageId"         0.07
    (token missing)   0.07
    "swiper"          0.07
    "抱团"            0.06
    "циальн"          0.06
    "Tabla"           0.06
    " SERIES"         0.06
    Activation Density: 0.051%

    No Known Activations