    NEGATIVE LOGITS
    Token       Logit
    gy          -0.20
    ton         -0.17
    hou         -0.16
    dge         -0.16
    BarItem     -0.16
    lessness    -0.15
    cz          -0.15
    dy          -0.15
    arta        -0.15
    çĵľ         -0.14
    POSITIVE LOGITS
    Token       Logit
     kernels    0.31
     Kernel     0.28
     cob        0.28
    pone        0.27
     kernel     0.27
    elian       0.26
     Cob        0.26
    stalk       0.25
    hus         0.25
    bread       0.24
    Activation density: 0.008%

    No Known Activations