    Explanations

    references to attributes or qualities related to various topics

    Negative Logits
    Token            Logit
    " avoient"       -1.15
    " auroit"        -1.12
    " feroit"        -1.09
    " pouvoit"       -1.09
    " ainfi"         -1.09
    " Efq"           -1.08
    " Majefty"       -1.08
    " zoude"         -1.08
    " myſelf"        -1.07
    " quæ"           -1.07
    Positive Logits
    Token            Logit
    "̣c"              0.79
                     0.62
    " ("             0.61
    " a"             0.61
    "[toxicity=0]"   0.60
    "́i"              0.58
    "?"              0.57
    ","              0.57
    " Ro"            0.55
    "s"              0.54
    Activation Density: 0.161%

    No Known Activations