INDEX
    Explanations

    mentions of the word "dog" in various forms and contexts

    Negative Logits
    Token               Logit
    " autorytatywna"    -0.59
    "umi"               -0.50
    "mira"              -0.50
    "umina"             -0.47
    " AMI"              -0.47
    "pert"              -0.47
    "versa"             -0.46
    "ini"               -0.46
    " Rasa"             -0.45
    "LLI"               -0.45

    Positive Logits
    Token               Logit
    "Dog"               2.17
    "dog"               2.17
    " Dog"              2.14
    " DOG"              1.88
    " dog"              1.74
    "DOG"               1.74
    "dogs"              1.47
    " Dogs"             1.46
    " dogs"             1.30
    "Dogs"              1.28

    Act Density 0.006%

    No Known Activations