    Explanations

    personal experiences or reflections

    Negative Logits
    "lation"        -0.75
    "geon"          -0.73
    "mund"          -0.69
    "ardless"       -0.67
    "math"          -0.64
    "Dom"           -0.62
    " srfAttach"    -0.62
    "reality"       -0.61
    "plant"         -0.60
    " Wem"          -0.60
    Positive Logits
    " learnt"       0.99
    " learned"      0.98
    " wish"         0.98
    " Learned"      0.90
    " bucket"       0.84
    " wanted"       0.81
    " noticed"      0.80
    " wished"       0.79
    " dislike"      0.79
    " forgot"       0.77
    Activation Density: 0.111%

    No Known Activations