INDEX
    Explanations

    various data points, particularly numbers and their associated contexts

    Negative Logits
    elho      -0.19
    usra      -0.17
    habi      -0.17
     °        -0.15
    afari     -0.15
    hab       -0.15
    URRE      -0.15
     Kenn     -0.14
    urette    -0.14
    ROTO      -0.14
    Positive Logits
    слов      0.16
    ha        0.16
    τά        0.15
    315       0.15
     Jer      0.15
    ordinate  0.15
    Jar       0.15
    jar       0.14
    配        0.14
    lear      0.14
    Act Density 0.002%

    No Known Activations