INDEX
    Explanations

    occurrences of the word "listen" and its variations

    Negative Logits
    Token        Logit
    "piler"      -0.18
    "gro"        -0.16
    "anner"      -0.16
    "ouncy"      -0.15
    "ãĥ³ãĤ¬"     -0.15
    "ikat"       -0.15
    "uman"       -0.15
    "dk"         -0.15
    "arty"       -0.15
    "licht"      -0.14
    Positive Logits
    Token          Logit
    "/watch"       0.32
    "ening"        0.28
    "/read"        0.26
    " closely"     0.25
    " int"         0.25
    "AndServe"     0.25
    " attent"      0.24
    "/view"        0.24
    " carefully"   0.21
    "ning"         0.20
    Activation Density: 0.022%

    No Known Activations