INDEX
    Explanations

    instances of the word "know" and its variations

    Negative Logits
    hread       -0.16
    cola        -0.16
    ãĥ¼ãĥĢ      -0.15
    ural        -0.14
    аÑĢод       -0.14
    wizard      -0.14
    shaw        -0.14
    imenti      -0.14
    apo         -0.14
    min         -0.14
    Positive Logits
    ledge       0.21
    upp         0.21
    -how        0.20
    ledged      0.19
    liness      0.18
    estar       0.16
    estic       0.16
    alg         0.15
    lobber      0.15
    erver       0.15
    Activation Density: 0.142%

    No Known Activations