    Negative Logits
    Token         Logit
    "outi"        -0.08
    "łym"         -0.07
    " surgiu"     -0.07
    "/list"       -0.07
    "yclic"       -0.07
    "opy"         -0.07
    "oló"         -0.07
    "ansyon"      -0.07
    "irmek"       -0.07
    "yper"        -0.07
    Positive Logits
    Token         Logit
    " ν"          0.08
    " φ"          0.08
    " Ω"          0.08
    " α"          0.08
    " starred"    0.08
    " gamma"      0.08
    "_kel"        0.08
    " γ"          0.08
    "421"         0.08
    " alpha"      0.08
    Activation Density: 0.009%

    No Known Activations