Explanations
articles or descriptors for animals
Negative Logits
Penh    0.67
愴    0.66
CVD    0.63
国务    0.62
vidé    0.60
Soder    0.60
欳    0.60
கதா    0.59
गरे    0.58
隰    0.57
Positive Logits
tiger    2.22
owl    2.18
shark    2.17
rabbit    2.12
squirrel    2.09
wolf    2.06
cheetah    2.05
turtle    1.99
eagle    1.98
dolphin    1.98
Activations Density 0.240%