    Explanations

    references to calm environments or states, especially in relation to furniture or settings

    Negative Logits
    Token          Logit
     للمعارف       -1.16
    <unused52>     -1.09
    <unused3>      -1.09
    <unused68>     -1.09
    <unused8>      -1.09
    [@BOS@]        -1.09
    <unused16>     -1.09
    <unused28>     -1.09
    <unused14>     -1.09
    <unused41>     -1.08
    Positive Logits
    Token          Logit
    3              0.61
    ↵↵             0.61
    #              0.58
    2              0.57
    1              0.56
    OM             0.53
    E              0.52
    R              0.52
    (not shown)    0.52
    p              0.52
    Activation Density: 0.498%

    No Known Activations