INDEX
    Explanations

    failed action or inability

    Negative Logits
    "ro"        0.63
    "IP"        0.56
    "Hoe"       0.53
    "rific"     0.52
    " Moving"   0.52
    " Hunting"  0.52
    "p"         0.52
    "Planning"  0.51
    "ythe"      0.51
    "lus"       0.51
    Positive Logits
    " מ"     0.74
    "ي"      0.67
    "िंग"    0.67
    " م"     0.64
    " ص"     0.64
             0.61
    " তিনি"  0.60
    " ג"     0.59
             0.59
             0.59
    Act Density 0.014%

    No Known Activations