    Negative Logits
    Token             Logit
    "Burst"           -0.08
    ";border"         -0.08
    " reassurance"    -0.08
    "holz"            -0.07
    " esmal"          -0.07
    " border"         -0.07
    " sudden"         -0.07
    ";color"          -0.07
    " Sandwich"       -0.07
    " distal"         -0.07
    Positive Logits
    Token             Logit
    " timestep"       0.11
    " развити"        0.09
    "imestep"         0.09
    ".tick"           0.09
    " avanço"         0.09
    "updates"         0.09
    "advance"         0.09
    ".update"         0.09
    ".Update"         0.09
    " advancing"      0.09
    Activation Density: 0.005%

    No Known Activations