INDEX
    Explanations

    numeric values related to statistics or scores

    Negative Logits
    Token         Logit
    " seventeen"  -0.33
    " nineteen"   -0.33
    " sixteen"    -0.31
    " twenty"     -0.31
    " eighteen"   -0.30
    "二十"        -0.29
    " fifteen"    -0.29
    " XIV"        -0.29
    " двад"       -0.29
    " Twenty"     -0.29
    Positive Logits
    Token  Logit
    "12"   0.55
    "11"   0.54
    "10"   0.51
    "9"    0.50
    "8"    0.49
    "7"    0.48
    "6"    0.47
    "5"    0.47
    "4"    0.45
    "3"    0.44
    Act Density 0.065%

    No Known Activations