INDEX
    Explanations
    Negative Logits
    "|."         -0.07
    "zk"         -0.07
    "ázd"        -0.06
    "Inspect"    -0.06
    "Invariant"  -0.06
    "bole"       -0.06
    "tır"        -0.06
    "etxt"       -0.06
    "_PROP"      -0.06
    "ضی"         -0.06
    Positive Logits
    "\tmy"                                0.08
    "Poly"                                0.07
                                          0.07
                                          0.06
    " đồng"                               0.06
    " věc"                                0.06
    " undecided"                          0.06
    "↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵"    0.06
    " flutter"                            0.06
    " '↵"                                 0.06
    Activation Density 0.003%

    No Known Activations