    Explanations

    topics related to romantic relationships and their complexities

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.03
    3:0.04
    4:0.04
    5:0.07
    6:0.02
    7:0.04
    8:0.03
    9:0.09
    10:0.40
    11:0.15
    Negative Logits
    Token               Logit
    " Archae"           -1.41
    "oggles"            -1.32
    "modules"           -1.27
    " stadiums"         -1.27
    "effects"           -1.25
    "asive"             -1.25
    "icons"             -1.24
    " archaeologists"   -1.23
    " pollutants"       -1.21
    "RGB"               -1.21
    Positive Logits
    Token               Logit
    " roommate"         1.61
    " boyfriend"        1.61
    " lover"            1.60
    " spouse"           1.55
    " fiance"           1.47
    " husband"          1.45
    " Loving"           1.36
    " reunion"          1.35
    " friendship"       1.35
    " bedroom"          1.34
    Act Density 0.451%

    No Known Activations