INDEX
    Explanations

    key emotional and relational words and phrases that indicate feelings and connections between people

    Negative Logits
    Token         Logit
    "ξη"          -0.19
    " zas"        -0.14
    " Walters"    -0.14
    ".Classes"    -0.13
    "ationship"   -0.13
    "ξ"           -0.13
    " erotisk"    -0.12
    "ينية"        -0.12
    "ename"       -0.12
    "NewLabel"    -0.12
    Positive Logits
    Token         Logit
    " each"       0.28
    " Each"       0.27
    " EACH"       0.24
    "Each"        0.24
    "each"        0.24
    "各"          0.20
    ".each"       0.19
    " каждого"    0.19
    "_each"       0.18
    " 各"         0.18
    Act Density 0.009%

    No Known Activations