INDEX
Explanations
words related to bird species
references to birds
Negative Logits
FINE        -0.80
NRS         -0.70
gur         -0.68
ilitary     -0.67
idential    -0.65
PLIED       -0.64
idates      -0.64
imaru       -0.63
oldemort    -0.63
Wr          -0.63
Positive Logits
bird        1.14
birds       1.04
irds        0.99
Birds       0.99
birds       0.96
seed        0.96
bird        0.90
owl         0.90
species     0.90
hawk        0.90
Activations Density: 0.008%