    Explanations

    references to historical figures and events

    Negative Logits
    149         -0.18
    157         -0.18
    161         -0.17
     Napoleon   -0.17
    celik       -0.16
    154         -0.16
    155         -0.16
    166         -0.15
    162         -0.15
    mdl         -0.15
    Positive Logits
     Norman     0.29
     Norm       0.29
     norm       0.26
    Norm        0.26
     норм       0.24
    norm        0.22
     norms      0.21
    .norm       0.19
    _norm       0.19
     Counts     0.18
    Act Density 0.044%

    No Known Activations