INDEX
    Explanations
    Negative Logits
        Token               Logit
        " constructed"      -0.07
        "struct"            -0.07
        "_trap"             -0.07
        ".re"               -0.07
        " ren"              -0.07
        "_CREATED"          -0.07
        "_currency"         -0.07
        " tense"            -0.07
        " reconstructed"    -0.07
        "Completion"        -0.07
    Positive Logits
        Token               Logit
        " obvious"          0.20
        " obviously"        0.15
        " Obviously"        0.13
        "Obviously"         0.12
        " очевид"           0.10
        "wig"               0.08
        "vious"             0.07
        " glaring"          0.07
        "Msp"               0.06
        " BF"               0.06
    Activation Density: 0.006%

    No Known Activations