    Explanations

    metrics and statistics associated with performance and scoring

    Negative Logits
    Token        Logit
    " Sutton"    -0.15
    "933"        -0.15
    "ilon"       -0.15
    "asin"       -0.14
    " Siz"       -0.14
    "sip"        -0.14
    "ilion"      -0.14
    " Stefan"    -0.14
    "784"        -0.14
    "713"        -0.14

    Positive Logits
    Token        Logit
    " score"     0.72
    " scores"    0.64
    "score"      0.62
    "-score"     0.60
    " Score"     0.60
    "_score"     0.57
    "Score"      0.57
    ".score"     0.56
    "-sc"        0.54
    " Scores"    0.53

    Activation Density: 0.098%

    No Known Activations