INDEX
Explanations
animals for which specific nouns are commonly used
animal names and terms related to wildlife
Negative Logits
challeng    -0.90
tiss        -0.78
cryst       -0.78
proport     -0.77
referen     -0.76
ngth        -0.74
arrang      -0.71
Palestin    -0.71
nce         -0.70
Ambro       -0.70
Positive Logits
hawks       1.09
eye         1.00
carc        0.92
hawk        0.91
flies       0.87
Redditor    0.87
fish        0.87
beetle      0.87
beetles     0.87
birds       0.85
Activations Density: 0.226%