    Negative Logits
    Logit   Token
    -0.07   " گاه"
    -0.07   " آیا"
    -0.06   "ुभ"
    -0.06   "pond"
    -0.06   "่ย"
    -0.06   " imposing"
    -0.06   " cave"
    -0.06   " varlık"
    -0.06   (blank)
    -0.06   "heim"
    Positive Logits
    Logit   Token
    0.11    " Motor"
    0.10    "Motor"
    0.10    " motor"
    0.09    " MOTOR"
    0.09    " motors"
    0.08    "motor"
    0.07    " Stafford"
    0.07    " moto"
    0.07    (blank)
    0.07    "HP"
    Activation Density: 0.012%

    No Known Activations