INDEX
    Explanations
    Negative Logits
    Token          Logit
    "tracking"     -0.08
    "kle"          -0.07
    " extent"      -0.07
    "arak"         -0.07
    " kadar"       -0.06
    " bölüm"       -0.06
    (blank)        -0.06
    " turret"      -0.06
    (blank)        -0.06
    "бас"          -0.06
    Positive Logits
    Token            Logit
    "-heavy"         0.07
    " Scre"          0.07
    (blank)          0.07
    (blank)          0.06
    " Conscious"     0.06
    "\Mapping"       0.06
    "_SD"            0.06
    "<()>"           0.06
    " reinforcing"   0.06
    " thị"           0.06
    Activation Density: 0.005%

    No Known Activations