INDEX
    Explanations

    phrases related to relationships and interpersonal connections

    Negative Logits
    Token            Logit
    "ga"             -0.15
    " taking"        -0.15
    "takes"          -0.15
    "uzu"            -0.15
    "lox"            -0.15
    "93"             -0.15
    "annie"          -0.14
    "éł"             -0.14
    "idth"           -0.14
    "uper"           -0.14

    Positive Logits
    Token            Logit
    " seriously"      0.23
    " hostage"        0.22
    ".setViewport"    0.18
    " places"         0.18
    " prisoner"       0.17
    " Seriously"      0.17
    " advantage"      0.16
    " Liberties"      0.16
    "вал"             0.16
    "_places"         0.15
    Act Density 0.032%

    No Known Activations