INDEX
    Explanations

    symbols, numbers, and mathematical operations

    NEGATIVE LOGITS
    Token         Logit
    " ob"         -0.24
    " Ob"         -0.21
    "Âłob"        -0.18
    "-ob"         -0.18
    "_ob"         -0.17
    "Ob"          -0.17
    "OB"          -0.16
    "ifu"         -0.16
    "šov"         -0.16
    " Kushner"    -0.16
    POSITIVE LOGITS
    Token         Logit
    "182"          0.33
    " obvious"     0.33
    "183"          0.28
    "181"          0.28
    " apparent"    0.27
    " evident"     0.27
    "180"          0.26
    "982"          0.24
    "282"          0.24
    "581"          0.23
    Activation Density: 0.038%

    No Known Activations