    Explanations

    references to historical events, particularly related to World War II

    Negative Logits
    Token           Logit
    " Valentine"    -0.15
    " dern"         -0.15
    " "             -0.15
    "ainter"        -0.15
    "æĭ"            -0.15
    " Ze"           -0.14
    " optic"        -0.14
    " torpedo"      -0.14
    " Joseph"       -0.13
    "teri"          -0.13

    Positive Logits
    Token           Logit
    " Norm"         0.46
    "Norm"          0.38
    " norm"         0.32
    " landing"      0.31
    " Landing"      0.28
    "landing"       0.27
    "norm"          0.26
    "(norm"         0.25
    ".norm"         0.25
    " Norman"       0.23

    Activation Density: 0.023%

    No Known Activations