    Explanations

    expressions related to friendship and close relationships

    Negative Logits
    Token        Logit
    "urement"    -0.15
    "айд"        -0.15
    "elter"      -0.15
    "ushima"     -0.14
    "ryo"        -0.13
    "ddit"       -0.13
    "factor"     -0.13
    "oleč"       -0.13
    "fusion"     -0.13
    "oub"        -0.13
    Positive Logits
    Token         Logit
    " friends"    0.77
    " friend"     0.69
    " FRIEND"     0.65
    " Friends"    0.65
    "friends"     0.62
    "朋友"        0.60
    "Friends"     0.60
    " Friend"     0.57
    "friend"      0.56
    "riends"      0.53
    Act Density 0.286%

    No Known Activations