INDEX
    Explanations

    phrases indicating intention or desire

    Negative Logits
     realise
    -0.14
    upon
    -0.14
    _refl
    -0.14
     recogn
    -0.14
    uxe
    -0.14
     Gam
    -0.14
    usterity
    -0.14
    fter
    -0.13
     Decide
    -0.13
    jk
    -0.13
    Positive Logits
     know
    0.28
     hear
    0.23
     knows
    0.22
     hearing
    0.22
     Know
    0.21
    -know
    0.21
    知道
    0.19
    Know
    0.19
    ось
    0.19
    know
    0.18
    Act Density 0.150%

    No Known Activations