Explanations
titles, names, or references to songs
mentions of the word "Song" in various contexts
Negative Logits
orate       -0.81
rontal      -0.80
Kear        -0.77
Pradesh     -0.69
fitted      -0.64
etheless    -0.63
astical     -0.62
agons       -0.62
LAT         -0.60
remlin      -0.59
Positive Logits
Song         1.45
Song         1.44
Songs        1.24
song         1.17
lyrics       1.15
writer       1.08
writers      1.04
bird         1.01
writing      0.99
lyric        0.98
Activations Density: 0.006%