INDEX
Explanations
references to birds and bird-related activities
Negative Logits
Token     Logit
ucch      -0.18
ecer      -0.17
ทาน       -0.16
itzer     -0.16
aurus     -0.15
embr      -0.15
stances   -0.15
ayers     -0.15
adoras    -0.15
iske      -0.15
Positive Logits
Token     Logit
sey       0.30
song      0.25
nest      0.25
ie        0.24
Nest      0.23
cage      0.22
nest      0.21
shit      0.21
Cage      0.20
seed      0.20
Activations Density: 0.011%