INDEX
Explanations
mentions of birds
references to birds
Negative Logits
idency        -0.93
ushima        -0.77
andem         -0.76
idential      -0.75
FINE          -0.74
andro         -0.72
arios         -0.71
Ces           -0.71
unregulated   -0.69
ilde          -0.69
Positive Logits
bird          1.43
birds         1.21
bird          1.12
Bird          1.06
Bird          0.99
hawk          0.96
owl           0.93
bats          0.92
Birds         0.90
birds         0.89
Activations Density 0.007%