INDEX
Explanations
mentions of clothing items related to the upper body
references to bras and related garments
Negative Logits
Memor          -0.73
matic          -0.73
eers           -0.73
Entertainment  -0.69
Chronicle      -0.66
Democr         -0.65
dated          -0.64
SEE            -0.64
Terrorism      -0.63
Procedure      -0.63
Positive Logits
bras           1.06
bra            1.04
livest         0.95
ille           0.94
tiss           0.88
¥ŀ             0.88
tradem         0.87
ided           0.86
streng         0.86
etooth         0.86
Activations Density: 0.006%