    Explanations

    numerical data or statistics in a document

    Negative Logits
     Efq            -0.91
     Monfieur       -0.85
     Eſ             -0.84
     ſta            -0.78
    LookAnd         -0.77
     ſever          -0.76
     Reſ            -0.76
     faſt           -0.76
     scattata       -0.76
     auffi          -0.74

    Positive Logits
    [toxicity=0]     0.74
    Slf              0.64
    *                0.63
    <eos>            0.62
                     0.52
    ̀u               0.52
                     0.51
    uxxxx            0.50
     ↑               0.50
    </s>             0.50
    Activation Density: 0.164%

    No Known Activations