INDEX
    Explanations

    The neuron is looking for words related to expressions of affection

    expressions of affection and fondness

    NEGATIVE LOGITS
    ulhu              -0.92
    ÄŁ                -0.76
    ramid             -0.74
    akedown           -0.71
    soDeliveryDate    -0.70
    krit              -0.65
    medi              -0.65
    ozo               -0.63
    proof             -0.62
    DoS               -0.62
    POSITIVE LOGITS
     affection        1.00
    ately             0.98
     fond             0.83
     kisses           0.76
    ate               0.73
    uously            0.72
    76561             0.70
    atile             0.68
     affinity         0.68
     passionately     0.67
    Act Density 0.059%

    No Known Activations