    Negative Logits
    Token           Logit
    "ijkstra"       -0.09
    " undo"         -0.09
    " hã"           -0.09
    "rella"         -0.09
    " relevance"    -0.08
    "ipay"          -0.08
    "conc"          -0.08
    " trophy"       -0.08
    "ackbar"        -0.08
    " relev"        -0.08
    Positive Logits
    Token            Logit
    " consequences"  0.32
    " consequence"   0.24
    " dire"          0.21
    " effects"       0.19
    "sequences"      0.17
    "ствия"          0.17
    "dire"           0.17
    " negative"      0.16
    " serious"       0.16
    " наслід"        0.15
    Act Density 0.047%

    No Known Activations