Explanations
references to the character Elsa from the Frozen franchise
Negative Logits
oday        -0.76
aceutical   -0.75
oons        -0.72
aneers      -0.71
addafi      -0.71
-+-+        -0.68
pora        -0.66
ritic       -0.65
onal        -0.64
ategory     -0.63
Positive Logits
issance     0.93
Elsa        0.83
Elsa        0.80
herself     0.79
ette        0.74
ís          0.72
ipeg        0.68
Maria       0.66
Lopez       0.65
Anna        0.64
Activations Density 0.004%
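
Token strings in listings like the ones above are often displayed in the printable alphabet of a byte-level BPE vocabulary, so a non-ASCII token can look garbled (for example `ÃŃs` standing for `ís`). The sketch below shows one way to map such a display string back to readable text. It assumes the tokens come from a GPT-2 style byte-level vocabulary, which this page does not state; the function names `bytes_to_unicode` and `decode_display_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Byte -> printable character table of GPT-2 style byte-level BPE (assumed vocabulary)."""
    # Bytes that are already printable keep their own code point.
    keep = (list(range(ord("!"), ord("~") + 1))
            + list(range(ord("¡"), ord("¬") + 1))
            + list(range(ord("®"), ord("ÿ") + 1)))
    codes = keep[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in keep:
            keep.append(b)
            codes.append(256 + n)
            n += 1
    return dict(zip(keep, map(chr, codes)))


def decode_display_token(token):
    """Convert a byte-level display string back to ordinary text."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # Partial UTF-8 sequences can occur in single tokens, so replace rather than raise.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for shown in ["ÃŃs", "Elsa", "ĠElsa"]:
        print(repr(shown), "->", repr(decode_display_token(shown)))
```

Under the same assumption, a leading `Ġ` in a display string stands for a space, which is one plausible reason the same word can appear twice in a top-logits list with different values.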