INDEX
    Explanations

    mentions of the name "Dave"

    Negative Logits
    人物      -0.18
    jit       -0.16
    loh       -0.15
    'gc       -0.14
    rish      -0.14
    orners    -0.14
    ecture    -0.14
    ulu       -0.14
    ход       -0.13
    _MAN      -0.13
    Positive Logits
    y         0.30
    igh       0.23
    ed        0.19
    yh        0.17
    IGH       0.16
    edar      0.16
    يد        0.16
    amer      0.15
    eder      0.15
    yaw       0.15
    Act Density 0.005%

    No Known Activations