INDEX
    Explanations

    Code/gibberish

    Negative Logits
    "(node"         -0.07
    "\tcal"         -0.07
    "[N"            -0.07
    "ief"           -0.07
    "nas"           -0.06
    " dubna"        -0.06
    "\tput"         -0.06
    " viele"        -0.06
    "anni"          -0.06
    " abolished"    -0.06
    Positive Logits
    ".SingleOrDefault"    0.07
                          0.07
    " niż"                0.07
    ".discount"           0.06
    "رق"                  0.06
                          0.06
    "ắn"                  0.06
    "้อม"                 0.06
    "πος"                 0.06
    ".target"             0.06
    Act Density 0.050%

    No Known Activations