INDEX
    Explanations

    phrases related to planning, discussion, and reflection

    sentences expressing collective thoughts or plans

    Negative Logits
    lation      -0.70
     Kills      -0.68
    reality     -0.67
     srfAttach  -0.65
    ardless     -0.62
     Nah        -0.62
     Sheen      -0.62
    itary       -0.61
     Fill       -0.60
     Cros       -0.60
    Positive Logits
     wish       0.99
     disliked   0.92
     wished     0.91
     wanted     0.90
     regret     0.89
     dislike    0.89
     forgot     0.88
     overlooked 0.85
     learned    0.85
     learnt     0.84
    Act Density 0.169%

    No Known Activations