INDEX
    Explanations

    expressions of surprise or unanticipated experiences

    Negative Logits
    Token        Logit
    "pillar"     -0.17
    "ambi"       -0.15
    "eah"        -0.15
    "aleb"       -0.15
    "istingu"    -0.14
    "emek"       -0.14
    "ocker"      -0.14
    "κολ"        -0.14
    "oyal"       -0.13
    "Trust"      -0.13
    Positive Logits
    Token           Logit
    " otherwise"    0.29
    " dream"        0.28
    " dreamed"      0.25
    " Dream"        0.24
    " previously"   0.24
    "dream"         0.24
    " even"         0.23
    "otherwise"     0.23
    "Dream"         0.22
    "梦"            0.22
    Activation Density: 0.104%

    No Known Activations