INDEX
    Explanations

    phrases indicating feelings of empathy or sympathy towards others

    Negative Logits
    Token          Logit
    "edin"         -0.75
    "hess"         -0.75
    "UP"           -0.74
    "FORE"         -0.73
    "ashtra"       -0.72
    "forward"      -0.69
    "oller"        -0.67
    "orbit"        -0.66
    "mare"         -0.66
    "ohn"          -0.65
    Positive Logits
    Token          Logit
    "bidden"       1.08
    "gotten"       1.06
    "geries"       0.87
    " starters"    0.79
    "ties"         0.78
    " sake"        0.78
    " example"     0.76
    " centuries"   0.75
    " them"        0.72
    " decades"     0.71
    Act Density 0.144%

    No Known Activations