    Explanations

    phrases indicating knowledge or expertise

    Negative Logits
    Token        Logit
    "zimmer"     -0.16
    "bil"        -0.16
    "uentes"     -0.15
    "nip"        -0.15
    " Boh"       -0.15
    "ibling"     -0.14
    "foon"       -0.14
    "ythe"       -0.14
    "jerne"      -0.14
    " NOTICE"    -0.14
    Positive Logits
    Token        Logit
    "ledged"     0.23
    "ingly"      0.23
    "ledge"      0.22
    " ledge"     0.20
    "-how"       0.19
    "ings"       0.19
    "edges"      0.19
    "lesi"       0.19
    "ance"       0.19
    "les"        0.19
    Activation Density: 0.013%

    No Known Activations