INDEX
    Explanations

    Dragon followed by specific nouns

    Negative Logits
    ".:"          0.43
    " Stef"       0.40
                  0.40
    "парта"       0.38
    " орке"       0.38
    "이에"        0.36
    " ঢাকার"      0.36
    "rive"        0.35
    " உரிமை"      0.35
                  0.35
    Positive Logits
    "🐲"          1.23
    "🐉"          1.21
    " dragon"     1.16
    " dragons"    1.10
    " Dragon"     1.07
    "Dragon"      1.07
    " Dragons"    1.02
                  0.95
    "dragon"      0.95
                  0.93
    Act Density 0.007%

    No Known Activations