Explanations
references to dogs and their associated behaviors or environments
Negative Logits
Rial          -0.94
Arcadia       -0.87
الحره         -0.87
Eſ            -0.82
Temples       -0.82
Transparency  -0.81
Dami          -0.79
Miri          -0.79
myſelf        -0.78
Bronnen       -0.78
Positive Logits
Dog   1.86
dogs  1.84
dog   1.84
Dog   1.79
Dogs  1.70
DOG   1.70
dog   1.61
Dogs  1.58
DOG   1.50
DOGS  1.48
Activations Density 0.021%