INDEX
    NEGATIVE LOGITS
    (token missing in source)  0.51
    beqz  0.47
    積極的に  0.46
    Secondo  0.46
    ².  0.45
     ejecut  0.45
     cedo  0.45
     अन्यथा  0.44
     dengan  0.44
    工作室  0.44
    POSITIVE LOGITS
    alne  0.45
    State  0.42
     kolejne  0.42
    Status  0.41
    باد  0.39
    Location  0.39
    atae  0.38
    бята  0.38
    arynx  0.38
    uola  0.38
    Activation Density: 0.003%

    No Known Activations