INDEX
    Explanations

    occurrences of the word "Col" along with related discussion points

    Negative Logits
    ehler      -0.17
    ÅĻiv       -0.15
    zes        -0.15
    esk        -0.15
    ega        -0.14
    ead        -0.14
    egend      -0.14
    idebar     -0.14
    unfold     -0.14
    .Ptr       -0.13
    Positive Logits
    lier       0.35
    leen       0.34
    fax        0.31
    lette      0.30
    ombo       0.29
    liers      0.28
    burn       0.28
    chester    0.27
    borne      0.27
    vin        0.26
    Act Density 0.010%

    No Known Activations