INDEX
    Explanations

    references to performance metrics and evaluation criteria in tasks or experiments

    Negative Logits
    " betweenstory"  -0.59
    "hyrchwyd"  -0.58
    " propOrder"  -0.55
    "Географиясе"  -0.53
    "Билгалдахарш"  -0.51
    " يتيمه"  -0.50
    "contentLoaded"  -0.50
    "Personendaten"  -0.49
    " оригіналу"  -0.48
    "tanleria"  -0.47
    Positive Logits
    " count"  0.74
    " counts"  0.72
    " counting"  0.71
    " number"  0.71
    " counted"  0.69
    " COUNT"  0.69
    "count"  0.65
    "counts"  0.65
    " Count"  0.64
    " numbers"  0.64
    Act Density 0.988%

    No Known Activations