INDEX
    Explanations

    mentions of marriage and relationships

    Negative Logits
    Token          Logit
    "kyt"          -0.15
    "bate"         -0.15
    "eken"         -0.15
    "swer"         -0.15
    " dames"       -0.14
    "ISCO"         -0.14
    "Principal"    -0.14
    " gec"         -0.14
    "á»"           -0.14
    " showc"       -0.14
    Positive Logits
    Token          Logit
    " estr"        0.21
    " ex"          0.20
    " model"       0.19
    " beau"        0.18
    "-model"       0.17
    "rum"          0.17
    "model"        0.17
    " cheating"    0.16
    " rum"         0.16
    " Estr"        0.16
    Activation Density: 0.054%

    No Known Activations