INDEX
    Explanations

    capital letters or specific names and titles

    Negative Logits
    elin    -0.20
    cx      -0.18
    cxx     -0.18
    ocs     -0.17
    elas    -0.16
    amax    -0.15
    ec      -0.15
    ears    -0.15
    SCII    -0.15
     Bird   -0.15
    Positive Logits
    yo      0.31
    gun     0.30
    jez     0.27
    duk     0.25
    ndo     0.25
    wo      0.25
    dog     0.24
    lor     0.24
    kit     0.24
    du      0.24
    Activation Density: 0.010%

    No Known Activations