    Explanations

    proper nouns, specifically names of people and organizations

    Negative Logits
    "laughter"         -0.15
    "zsche"            -0.15
    " بش"              -0.14
    "ibase"            -0.14
    "mmas"             -0.14
    " descriptions"    -0.13
    " praise"          -0.13
    " description"     -0.13
    "iset"             -0.13
    "iface"            -0.13

    Positive Logits
    " told"            0.46
    " tells"           0.44
    " tell"            0.40
    "Âłt"              0.39
    " telling"         0.36
    "åijĬè¯ī"           0.33
    "tell"             0.32
    "Tell"             0.32
    " Tells"           0.30
    " Tell"            0.30

    Act Density 0.072%

    No Known Activations