INDEX
    Explanations

    references to dragons and their characteristics

    Negative Logits
    Token          Logit
    " rat"         -0.17
    "opher"        -0.15
    " spiders"     -0.15
    " tas"         -0.15
    "ãĥ¼ãĤ¸"       -0.15
    " ware"        -0.14
    "ocup"         -0.14
    " ris"         -0.14
    "FromArray"    -0.14
    " rabbit"      -0.14
    Positive Logits
    Token          Logit
    " dragon"       0.27
    "dragon"        0.26
    " dragons"      0.25
    "ragon"         0.25
    " Dragon"       0.24
    " é¾"           0.23
    " Dragons"      0.23
    "Dragon"        0.22
    "é¾į"           0.20
    " dracon"       0.19
    Act Density 0.030%

    No Known Activations