    Explanations

    nouns related to essential human concepts and entities

    Negative Logits
    ppat      -1.80
     \[*      -1.69
     âĢķ      -1.67
     **[      -1.60
    pgen      -1.49
    dom       -1.46
              -1.43
     ([**     -1.40
              -1.39
    pntd      -1.35
    Positive Logits
    «       3.10
    ĻĤ      2.98
    ł       2.97
    IJ       2.91
    Ĩ       2.81
    »¿      2.80
    Ī       2.79
    Īĺ      2.77
    ¦       2.77
    ĸ       2.73
    Act Density 0.051%

    No Known Activations