INDEX
    Explanations

    phrases related to legal actions and decisions

    Negative Logits
    Token        Logit
    "åĪĹ"        -0.16
    "irting"     -0.15
    " vielen"    -0.15
    "lund"       -0.15
    "_TP"        -0.14
    "ico"        -0.14
    "ox"         -0.13
    "oha"        -0.13
    "anuts"      -0.13
    "kt"         -0.13
    Positive Logits
    Token        Logit
    " one"       0.50
    " first"     0.35
    " ones"      0.33
    "åĪĨåĪ«"     0.33
    "one"        0.31
    ":first"     0.31
    " primero"   0.30
    "—one"       0.30
    "-one"       0.29
    "first"      0.28
    Activation Density: 0.191%

    No Known Activations