INDEX
    Explanations

    phrases related to question and answer formats

    Negative Logits
    Token       Logit
    "ufact"     -0.75
    "raint"     -0.68
    "etheless"  -0.60
    "fell"      -0.57
    "ipedia"    -0.57
    "wolf"      -0.56
    " Goldman"  -0.56
    "fitted"    -0.56
    "fitting"   -0.54
    " neb"      -0.53
    Positive Logits
    Token       Logit
    "FAQ"       0.91
    "Answer"    0.90
    "Q"         0.88
    "Reply"     0.85
    "Ds"        0.79
    "Cs"        0.78
    "answered"  0.78
    "answer"    0.78
    " Answer"   0.76
    "LD"        0.75
    Activation Density: 0.022%

    No Known Activations