INDEX
    Explanations

    numbers, dates, and measurements within the text

    Negative Logits
     nin        -0.18
    80          -0.18
    72          -0.17
     Nin        -0.15
     sevent     -0.15
     eighty     -0.15
    87          -0.15
    78          -0.15
    79          -0.15
    92          -0.15
    Positive Logits
    194         0.83
    ۱۹۴         0.56
    195         0.54
    ۱۹۵         0.40
    193         0.37
     wartime    0.34
     WWII       0.31
     fasc       0.26
     war        0.24
     fascism    0.23
    Act Density 0.050%

    No Known Activations