INDEX
    Explanations
    NEGATIVE LOGITS
    jug           0.38
    bomb          0.36
    ियमित         0.36
                  0.35
     جگ           0.35
    Bomb          0.35
     نکل          0.35
    Coming        0.34
    agedy         0.34
    Parallel      0.34
    POSITIVE LOGITS
     κ            0.40
     captions     0.37
     Monster      0.36
     Zhong        0.36
     Mensch       0.36
     podmín       0.36
     mini         0.35
     monospace    0.35
     Centers      0.34
    мень          0.34
    Activation Density: 0.002%

    No Known Activations