INDEX
    Explanations

    clauses or phrases expressing claims, decisions, or proposals

    Negative Logits
    Token                Logit
    "IntoConstraints"    -0.55
    "GenerationType"     -0.49
    " blessures"         -0.43
    "mergeFrom"          -0.40
    ""                   -0.38
    "Slf"                -0.38
    "เกิน"               -0.36
    " žena"              -0.35
    "mesi"               -0.35
    " själv"             -0.35
    Positive Logits
    Token                Logit
    " themselves"        1.20
    "Their"              1.11
    " Their"             1.10
    "themselves"         1.10
    "their"              1.09
    " their"             1.09
    " they"              0.94
    " THEIR"             0.93
    "they"               0.92
    " mereka"            0.92
    Activation Density 0.763%

    No Known Activations