Explanations
topics related to romantic relationships and their complexities
Head Attr Weights
0:0.02
1:0.02
2:0.03
3:0.04
4:0.04
5:0.07
6:0.02
7:0.04
8:0.03
9:0.09
10:0.40
11:0.15
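
A minimal sketch for reading the head attribution weights above, assuming each line is a `head_index:weight` pair. The names `RAW` and `parse_head_weights` are hypothetical and are not part of any tool. The values as listed sum to 0.95, not 1, so the sketch reports the top head's share of the listed total and does not assume the weights are normalized.

```python
# Head attribution weights copied verbatim from the listing above.
RAW = """\
0:0.02
1:0.02
2:0.03
3:0.04
4:0.04
5:0.07
6:0.02
7:0.04
8:0.03
9:0.09
10:0.40
11:0.15
"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a dict {head_index: weight}."""
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW)
total = sum(weights.values())            # sum of the weights as listed
top_head = max(weights, key=weights.get) # head with the largest weight
top_share = weights[top_head] / total    # its share of the listed total

print(f"top head: {top_head}, weight: {weights[top_head]:.2f}, "
      f"share of listed total: {top_share:.1%}")
```

Under these assumptions, head 10 carries the largest weight, followed by head 11.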
Negative Logits
Archae          -1.41
oggles          -1.32
modules         -1.27
stadiums        -1.27
effects         -1.25
asive           -1.25
icons           -1.24
archaeologists  -1.23
pollutants      -1.21
RGB             -1.21
Positive Logits
roommate        1.61
boyfriend       1.61
lover           1.60
spouse          1.55
fiance          1.47
husband         1.45
Loving          1.36
reunion         1.35
friendship      1.35
bedroom         1.34
Activations Density 0.451%