INDEX
Explanations
references to elephants in various contexts
elephant and mammoth
Negative Logits
prawda     -0.42
liggen     -0.41
area       -0.40
area       -0.40
ratio      -0.39
outono     -0.39
ywna       -0.38
írito      -0.38
glBind     -0.38
bias       -0.37
Positive Logits
elephant   1.36
Elephant   1.35
Elephant   1.34
elephants  1.34
Elephants  1.23
elephant   1.20
phants     1.09
elefante   1.07
elef       1.00
🐘         0.82
Activations Density 0.004%