INDEX
    Explanations

    mentions of close relationships and strong emotional connections

    references to friendship and social relationships

    Negative Logits
    "overe"          -0.78
    "vertisement"    -0.76
    "authorized"     -0.72
    "ucks"           -0.71
    "itial"          -0.68
    "inis"           -0.67
    "informed"       -0.67
    "odo"            -0.65
    "ijing"          -0.63
    " deliber"       -0.62
    Positive Logits
    " Romeo"         0.87
    " Valerie"       0.77
    "lier"           0.75
    "hips"           0.74
    "liness"         0.72
    " Draco"         0.71
    " Huma"          0.71
    " Giul"          0.68
    " Flavoring"     0.67
    " Tad"           0.67
    Activation Density: 0.104%

    No Known Activations