INDEX
    Explanations

    instances of the word "but" in various contexts

    Negative Logits
    ibling      -0.17
    iliar       -0.17
    ï¸ı         -0.17
    avou        -0.16
    imagenes    -0.16
    ippets      -0.15
    ippet       -0.15
    locate      -0.15
    peare       -0.15
    apolis      -0.14
    Positive Logits
    ters        0.39
    chers       0.38
    term        0.38
    tern        0.35
    cher        0.35
    lers        0.31
    ts          0.30
    ch          0.29
    tery        0.29
    ler         0.28
    Activation Density 0.032%

    No Known Activations