INDEX
    Explanations

    phrases or statements within quotation marks

    Negative Logits
    ,           -0.68
    )=          -0.65
    Ͻ           -0.61
    .           -0.60
    ulic        -0.60
    )           -0.59
    itan        -0.57
    )/          -0.55
    paren       -0.55
    )\          -0.55
    Positive Logits
    /"          0.99
    ]:          0.58
     --         0.57
    sic         0.56
     namely     0.56
     ["         0.55
    that        0.54
     because    0.53
     referring  0.53
                0.53
    Act Density 0.111%

    No Known Activations