INDEX
    Explanations

    IG or SIG followed by common abbreviations

    Negative Logits
    Token                  Value
    (token not shown)      1.89
    "t"                    1.72
    "ی"                    1.48
    (token not shown)      1.27
    (token not shown)      1.24
    "ک"                    1.22
    "۹"                    1.16
    "ری"                   1.14
    "وی"                   1.13
    (token not shown)      1.13
    Positive Logits
    Token                  Value
    "ur"                   0.86
    (token not shown)      0.79
    "inda"                 0.79
    "يان"                  0.79
    " gameState"           0.79
    "mäßig"                0.77
    "مون"                  0.76
    " مي"                  0.75
    "اي"                   0.71
    "मधील"                 0.71
    Activation Density: 0.001%

    No Known Activations