    NEGATIVE LOGITS
    " Slot"     -0.11
    "Slot"      -0.09
    "slot"      -0.09
    "Slots"     -0.09
    "_slot"     -0.09
    "_slots"    -0.08
    "slots"     -0.08
    " слот"     -0.08
    ".slot"     -0.08
    " slots"    -0.08
    POSITIVE LOGITS
    " across"         0.11
    "一致"            0.10
    " consistency"    0.10
    "Consistency"     0.10
    " Across"         0.09
    " synchronize"    0.08
    " référ"          0.08
    "同步"            0.08
    "Across"          0.08
    " borr"           0.08
    Activation Density: 0.012%

    No Known Activations