    Negative Logits
    .color          -0.07
    _da             -0.07
    ěk              -0.06
     tích           -0.06
     ion            -0.06
    ibilidade       -0.06
     وزن            -0.06
    .pag            -0.06
     فنی            -0.06
    _star           -0.06
    Positive Logits
     vuel            0.06
     taxpayer        0.06
     LINEAR          0.06
     contractor      0.06
    .AutoScale       0.06
     estates         0.06
    unct             0.06
     concerned       0.06
    ",[              0.06
    --)↵             0.06
    Activation Density: 0.002%

    No Known Activations