INDEX
    Explanations

    references to a specific name or entity

    mentions of the word "Der"

    Negative Logits
    Token          Logit
    " Samoa"       -0.67
    "INTER"        -0.66
    " Elephant"    -0.65
    "odcast"       -0.63
    "toggle"       -0.62
    " [|"          -0.62
    "hetti"        -0.61
    "ZI"           -0.61
    " Fenrir"      -0.60
    "ships"        -0.59
    Positive Logits
    Token          Logit
    "ricks"        1.08
    "bys"          1.04
    "ivation"      0.99
    "ived"         0.87
    "bil"          0.87
    "rick"         0.86
    "ription"      0.84
    " Spiegel"     0.84
    "rek"          0.82
    "cliffe"       0.81
    Activation Density: 0.021%

    No Known Activations