INDEX
    Explanations

    mentions of friendships or friendly interactions

    Negative Logits
    Token           Logit
    "tarians"       -0.69
    "aeda"          -0.65
    "ournal"        -0.62
    "Analysis"      -0.61
    " untreated"    -0.61
    " chloride"     -0.59
    "acent"         -0.58
    "ERO"           -0.57
    " heel"         -0.57
    "ilion"         -0.55
    Positive Logits
    Token           Logit
    "liest"         1.67
    "liness"        1.66
    "lier"          1.66
    "lies"          1.53
    "ship"          1.21
    "ships"         1.18
    "finder"        0.92
    "hips"          0.88
    "hetical"       0.84
    "nee"           0.83
    Activation Density: 0.040%

    No Known Activations