INDEX
    Explanations

    references to numbers, particularly related to quantity or lists

    Negative Logits
    Token    Logit
    "3"      -0.76
    "4"      -0.75
    "5"      -0.72
    "7"      -0.71
    "6"      -0.69
    "2"      -0.69
    "0"      -0.68
    "9"      -0.68
    "8"      -0.67
    "1"      -0.64
    Positive Logits
    Token         Logit
    "aarrggbb"    1.05
    " huit"       1.05
    "Према"       1.01
    "four"        1.00
    " eight"      0.98
    "Four"        0.98
    " seven"      0.96
    "eight"       0.95
    "five"        0.95
    "seven"       0.95
    Activation Density: 0.121%
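
The density figure is most naturally read as the share of token positions on which the feature fires. The page itself does not define the metric, so the sketch below rests on that assumption: it counts positions whose activation exceeds a threshold and divides by the total. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). This is an illustrative definition, not
    one taken from the dashboard.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 121 of which are active.
sample = [0.0] * 99_879 + [1.5] * 121
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.121% corresponds to the feature firing on roughly 1 in every 830 tokens.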

    No Known Activations