    Explanations

    phrases indicating direct communication and assistance

    Negative Logits
    Token           Logit
    "ehler"         -0.19
    "eniz"          -0.16
    "اÙĦات"         -0.16
    "801"           -0.15
    "opoulos"       -0.15
    " sodom"        -0.15
    "ussen"         -0.15
    " Leak"         -0.14
    ".med"          -0.14
    "gae"           -0.14
    Positive Logits
    Token           Logit
    "Modifiers"     0.15
    " brides"       0.14
    "upos"          0.14
    "quis"          0.14
    " lim"          0.14
    " wed"          0.14
    "/functions"    0.14
    "liÄį"          0.14
    "wed"           0.13
    " hon"          0.13
    Activation Density: 0.000%

    No Known Activations