    Explanations

    dialogue and statements

    Negative Logits
    "ériment"       0.72
    "ipple"         0.71
    "ipation"       0.67
    "ambda"         0.64
    "感受"          0.64
    "Digest"        0.63
    " נישט"         0.62
    "ప్రత్యర్థి"      0.62
    "ർച്ച"          0.61
    "landmark"      0.61
    Positive Logits
    " told"         2.10
    " informed"     1.93
    " explained"    1.90
    " insisted"     1.79
    " begged"       1.70
    "told"          1.65
    " warned"       1.62
    " assured"      1.57
    " informs"      1.57
    " promised"     1.57
    Activation Density 0.134%

    No Known Activations