INDEX
Explanations
mentions of the name "Taylor Swift"
mentions of the name "Taylor."
Negative Logits
PDATE      -0.89
rontal     -0.89
undai      -0.86
oard       -0.85
hemy       -0.84
cffffcc    -0.80
ntil       -0.77
cffff      -0.77
vous       -0.75
rely       -0.73
Positive Logits
Made       1.12
Swift      1.02
obi        0.80
Taylor     0.78
ville      0.72
ivation    0.71
hurst      0.70
mann       0.70
ual        0.70
ying       0.69
Activations Density 0.020%