INDEX
    Explanations

    /projects and /

    Negative Logits
    /pop               -0.07
     basin             -0.07
    ôte                -0.06
     whistleblower     -0.06
     نشده              -0.06
    _packages          -0.06
    What               -0.06
    OLL                -0.06
    oS                 -0.06
    850                -0.06
    Positive Logits
     Persistent        0.07
     schop             0.06
     брон              0.06
    apons              0.06
     Бор               0.06
     Electronics       0.06
     muscular          0.06
    agnet              0.06
    eligible           0.06
     dragging          0.06
    Activation Density: 0.005%

    No Known Activations