INDEX
    Explanations

    references to understanding and comprehension in personal contexts

    Negative Logits
    Token       Logit
    "ì¹"        -0.17
    "hma"       -0.15
    "tell"      -0.15
    "heet"      -0.14
    "HING"      -0.14
    "ÙħاÙħ"    -0.14
    "adam"      -0.14
    "553"       -0.14
    "kart"      -0.14
    "robot"     -0.14
    Positive Logits
    Token               Logit
    " understanding"    0.21
    " understand"       0.20
    " Understanding"    0.19
    " understands"      0.18
    " fully"            0.17
    " why"              0.16
    " comprehension"    0.16
    "Understanding"     0.16
    " comprend"         0.16
    " abstract"         0.15
    Activation Density: 0.246%

    No Known Activations