INDEX
Explanations
references to music genres, specifically metal
Negative Logits
pop        -0.19
steller    -0.16
Pop        -0.16
POP        -0.15
inand      -0.15
-pop       -0.15
uncert     -0.15
_dash      -0.14
ricks      -0.14
抵         -0.14
Positive Logits
metal      0.33
Metal      0.33
metal      0.32
Metal      0.31
metall     0.28
-metal     0.25
metals     0.24
Metals     0.23
Metallic   0.23
метал      0.23
Activations Density 0.200%
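
As a rough illustration of the density figure above, here is a minimal sketch that assumes density means the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the feature fires on 2 of 1000 tokens.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.200% corresponds to the feature firing on about 2 of every 1000 tokens.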