INDEX
    Explanations

    words associated with confinement or restrictions

    Negative Logits
    .Builder    -0.15
    amura       -0.15
    (Buffer     -0.15
    deaux       -0.14
    (Bundle     -0.14
    /block      -0.14
    mgr         -0.14
    mund        -0.14
    exact       -0.13
    tries       -0.13
    Positive Logits
    bs          0.67
    b           0.67
    ba          0.54
    bed         0.54
    б           0.50
    be          0.49
    bd          0.49
    bb          0.49
    bc          0.48
    bin         0.48
    Activation Density: 0.148%

    No Known Activations