INDEX
    Negative Logits
    Token         Logit
    " kolor"      -0.08
    "(Vertex"     -0.08
    "edly"        -0.08
    "terminal"    -0.08
    ",还有"        -0.07
    "反馈"         -0.07
    "idente"      -0.07
    "表示"         -0.07
    " anonym"     -0.07
    " Wik"        -0.07
    Positive Logits
    Token             Logit
    "Allocation"      0.12
    " Allocation"     0.11
    " allocation"     0.10
    " 전략"            0.09
    "allocation"      0.09
    " रणनी"           0.09
    ".alloc"          0.09
    " strategist"     0.09
    "Placement"       0.09
    " размещ"         0.09
    Activation Density: 0.004%

    No Known Activations