INDEX
    Explanations

    references to dogs in various contexts

    Negative Logits
    " Camb"            -0.92
    " الحره"           -0.91
    " Rial"            -0.91
    "andExpect"        -0.91
    " Hillsborough"    -0.86
    " Transparency"    -0.85
    " Thon"            -0.83
    " Brice"           -0.82
    " Temples"         -0.82
    " pinn"            -0.81
    Positive Logits
    " dogs"    1.55
    " Dog"     1.41
    " dog"     1.41
    " Dogs"    1.39
    " DOG"     1.35
    "Dogs"     1.31
    "Dog"      1.30
    " DOGS"    1.24
    "dogs"     1.18
    "dog"      1.15
    Activation Density: 0.224%

    No Known Activations