    Negative Logits
     وبين          -0.09
     Ley           -0.09
     სიმ           -0.08
     VD            -0.08
     gim           -0.08
                   -0.08
     twenties      -0.08
    utang          -0.08
     certe         -0.08
     دول           -0.08
    Positive Logits
    abi            0.07
                   0.07
     incorpor      0.07
    Food           0.07
    Advertising    0.07
     Kafka         0.07
     Advertising   0.07
    广告           0.07
     adhered       0.07
     advertising   0.07
    Activation Density: 0.001%

    No Known Activations