    Explanations

    automatic actions and detection

    Negative Logits
    affirming    0.33
     elucidation    0.32
    (token not rendered)    0.32
     ندار    0.32
     dobbiamo    0.32
    ٰ    0.32
     имели    0.31
    ೀರಿ    0.30
    (token not rendered)    0.30
     ছিলেন    0.30
    Positive Logits
     automatically    1.28
     автоматически    1.11
    自动    1.01
    automatically    1.00
     detects    0.98
     자동으로    0.97
     automáticamente    0.96
     Automatically    0.96
     automatiquement    0.96
    Automatically    0.95
    Act Density 0.235%

    No Known Activations