    Explanations

    phrases implying contrasting viewpoints or actions

    Negative Logits
    " Juf"       -0.97
    " aen"       -0.94
    " NOO"       -0.93
    " thut"      -0.90
    " „,"        -0.90
    " Hano"      -0.90
    " ftu"       -0.88
    " Febru"     -0.88
    " ufe"       -0.88
    " nomine"    -0.88
    Positive Logits
    "<bos>"               0.58
    "AccessorTable"       0.53
    " desire"             0.52
    "bidden"              0.49
    "ември"               0.49
    "USTAIN"              0.48
    " vlieg"              0.48
    "ValueStyle"          0.48
    "AppRoutingModule"    0.47
    "Necesito"            0.46
    Activation Density: 0.146%

    No Known Activations