INDEX
    Explanations
    Negative Logits
    Token                     Logit
    "ی"                       1.59
    "i"                       1.52
    "מ"                       1.45
    " celebrity"              1.42
    "‌های"                     1.38
    "During"                  1.37
    "aggi"                    1.33
    "ामध्ये"                    1.32
    (token missing in source) 1.32
    "א"                       1.31
    POSITIVE LOGITS
    Token                     Logit
    " reassured"              1.46
    " desirable"              1.43
    "ddots"                   1.40
    " sedent"                 1.36
    " 操作"                   1.36
    " faithfulness"           1.35
    (token missing in source) 1.34
    "ابر"                     1.34
    " bioavailability"        1.33
    " unavoid"                1.32
    Activation Density 0.192%

    No Known Activations