    Explanations

    metrics related to scoring and evaluations

    Negative Logits
    =""/>           -0.74
    isnan           -0.72
    Gweler          -0.69
    AlterField      -0.69
     препратки      -0.68
    zewod           -0.67
     onCancelled    -0.66
     Haynes         -0.65
    gmx             -0.65
    Hyde            -0.64
    Positive Logits
     scores         1.62
     Scores         1.54
     score          1.50
     scored         1.50
     Score          1.41
     scoring        1.37
     SCORE          1.36
    Scores          1.33
    scores          1.32
    Score           1.27
    Activation Density: 0.032%

    No Known Activations