INDEX
    Explanations

    conversational text following 'model'

    Negative Logits
    ),                0.39
     proportionally   0.36
     heuristic        0.35
     heuristics       0.35
     proportion       0.33
    ",                0.33
     coefficient      0.33
     asymptotic       0.33
     operands         0.33
     servic           0.32
    Positive Logits
    Either            0.35
    Shows             0.35
    Good              0.34
    好吧              0.34
     Small            0.34
    Well              0.34
                      0.34
    Anyone            0.33
    Several           0.33
     Plenty           0.33
    Act Density 0.189%

    No Known Activations