    Explanations

    fractional values, such as measurements in halves

    occurrences of specific numerical values and names

    Negative Logits
    rine        -0.92
    alia        -0.81
    ories       -0.78
    opher       -0.77
    utical      -0.77
    rums        -0.77
    alus        -0.75
    ancy        -0.75
    uating      -0.74
    rina        -0.73
    Positive Logits
    Ò           0.77
    cffffcc     0.73
    abouts      0.70
     Flavoring  0.68
    jud         0.68
    uberty      0.68
     hearts     0.67
    WAYS        0.65
     Mandela    0.65
    xual        0.63
    Activation Density: 0.030%

    No Known Activations