INDEX
    Explanations

    question starters (who, how)

    Negative Logits
    " heuristics"      0.52
    " polynomial"      0.49
    " determining"     0.49
    " amyloid"         0.48
    "determining"      0.48
    " coefficients"    0.47
    " heuristic"       0.47
    " uncertainty"     0.46
    " nutrient"        0.46
    " instability"     0.46

    Positive Logits
    "ok"         0.69
    "you"        0.62
    "who"        0.61
    "could"      0.58
    "Ok"         0.57
    "Alright"    0.55
    " اوكي"      0.55
    "lol"        0.55
    "Who"        0.54
    "YOU"        0.54
    Act Density 0.001%

    No Known Activations