INDEX
    NEGATIVE LOGITS
    Token             Logit
    " tục"            -0.08
    " penalties"      -0.08
    "дум"             -0.07
    "zioni"           -0.07
    "ций"             -0.07
    (blank)           -0.07
    " songwriting"    -0.07
    "ână"             -0.07
    "orsz"            -0.07
    " viscosity"      -0.07

    POSITIVE LOGITS
    Token             Logit
    "Centered"        0.13
    ".center"         0.13
    " centered"       0.12
    ":center"         0.12
    "\tcenter"        0.11
    "_center"         0.11
    "_CENTER"         0.11
    ".Center"         0.11
    ".CENTER"         0.11
    "-center"         0.11

    Activation Density: 0.009%

    No Known Activations