INDEX
    Explanations

    phrases attributing authorship or responsibility

    Negative Logits
    Token     Logit
    " co"     -0.49
    " sa"     -0.47
    " all"    -0.44
    " sent"   -0.44
    "<eos>"   -0.44
    " her"    -0.42
    "ոյ"      -0.41
    " indu"   -0.41
    " k"      -0.41
    " no"     -0.40
    Positive Logits
    Token                 Logit
    " Monfieur"           1.05
    " Efq"                0.98
    " itſelf"             0.98
    " Jefus"              0.94
    " Reſ"                0.94
    "UnusedPrivate"       0.92
    " Theſe"              0.92
    "WebElementEntity"    0.92
    " doubtnut"           0.91
    " ſche"               0.90
    Activation Density: 0.000%

    No Known Activations