    Explanations

    advantages/strengths

    Negative Logits
    (not rendered)    -0.09
    " damping"        -0.08
    " doping"         -0.07
    "家电"            -0.07
    (not rendered)    -0.07
    "言语"            -0.06
    (not rendered)    -0.06
    " Appropri"       -0.06
    "سلح"             -0.06
    " Speedway"       -0.06
    Positive Logits
    "까지"            0.07
    "Donate"          0.07
    (not rendered)    0.07
    (not rendered)    0.07
    "Quarter"         0.07
    "precio"          0.07
    "今まで"          0.07
    " founders"       0.07
    " Lee"            0.07
    " instantly"      0.06
    Activation Density: 0.050%

    No Known Activations