    Negative Logits
    Token         Logit
     carboxyl     0.39
     wasp         0.38
     μαγγ         0.38
     handyman     0.38
     doar         0.37
    🗡            0.37
     jugando      0.36
     zmian        0.36
     wheelchair   0.36
     vinaig       0.36
    Positive Logits
    Token         Logit
                  0.42
    i             0.37
    import        0.35
    ,             0.35
    head          0.34
    <0xE2>        0.34
    dec           0.34
    cas           0.33
    loading       0.33
    O             0.33
    Activation Density: 0.004%

    No Known Activations