INDEX
    Explanations
    Negative Logits
    "itele"     -0.09
    "walker"    -0.08
    "či"        -0.07
    "gate"      -0.07
    " phased"   -0.07
    " gesk"     -0.07
    "gi"        -0.07
    "binder"    -0.07
                -0.07
    " karşı"    -0.07
    Positive Logits
    ".toggle"   0.10
    " Toggle"   0.10
    "Toggle"    0.10
    "_toggle"   0.09
    " toggle"   0.09
    " premi"    0.08
    ".Boolean"  0.08
    "Boolean"   0.08
    " flags"    0.08
    "-toggle"   0.08
    Activation Density: 0.009%

    No Known Activations