    Explanations

    proper nouns, particularly names

    Negative Logits
    Token         Logit
    "baum"        -0.18
    "rade"        -0.16
    "ria"         -0.15
    "bef"         -0.15
    "ιας"         -0.15
    "ni"          -0.15
    "senal"       -0.14
    " gezocht"    -0.14
    "ooks"        -0.14
    "bart"        -0.14
    Positive Logits
    Token         Logit
    "ise"         0.29
    "vre"         0.22
    "isa"         0.20
    "nger"        0.20
    "ie"          0.19
    "Lou"         0.19
    "verture"     0.18
    "igi"         0.18
    "loud"        0.18
    "ette"        0.17
    Activation Density: 0.006%

    No Known Activations