    Explanations

    instances of self-reflection and acknowledgment in personal experiences

    Negative Logits
    "en"                 -0.52
    " on"                -0.51
    (token not shown)    -0.50
    "th"                 -0.50
    "бок"                -0.50
    "mb"                 -0.49
    "t"                  -0.49
    <eos>                -0.49
    " h"                 -0.49
    (token not shown)    -0.48

    Positive Logits
    " admit"             1.31
    " approve"           1.05
    " myſelf"            1.03
    " accept"            1.03
    " acknowledge"       1.01
    " recognise"         1.00
    " Efq"               1.00
    " itſelf"            1.00
    " recognize"         0.99
    " agree"             0.96
    Act Density 0.129%

    No Known Activations