    Explanations

    references to the number six and related quantities

    Negative Logits
    Token        Logit
    aren         -0.17
    led          -0.16
    ont          -0.16
    ived         -0.16
    ishments     -0.15
    aira         -0.15
    iams         -0.14
    558          -0.14
    ports        -0.14
    ajan         -0.14
    Positive Logits
    Token        Logit
    teenth       0.32
    ties         0.28
    teen         0.26
    ti           0.24
    ty           0.23
    ãģ¤ãģ®       0.20
    th           0.19
    -figure      0.19
    sense        0.19
    ï¸ı          0.19
    Activation Density: 0.080%

    No Known Activations