    Explanations

    punctuation marks indicating questions or dramatic pauses

    Negative Logits
    Token         Logit
    "Locator"     -0.15
    "hti"         -0.14
    "εια"         -0.14
    "erence"      -0.14
    "구"          -0.14
    " herself"    -0.14
    "vek"         -0.13
    "剛"          -0.13
    "fern"        -0.13
    "anca"        -0.13
    Positive Logits
    Token         Logit
    " I"           0.30
    " we"          0.28
    " they"        0.20
    " We"          0.19
    " none"        0.18
    "I"            0.18
    " there"       0.18
    " inval"       0.17
    "我"           0.16
    "our"          0.16
    Activation Density: 0.001%

    No Known Activations