INDEX
    Explanations

    names of people and locations

    Negative Logits
     }{@              -0.92
    ValueStyle        -0.90
     ModelExpression  -0.88
    boutin            -0.86
    rungsseite        -0.85
     Monfieur         -0.79
     שוליים           -0.79
     itſelf           -0.78
     iſt              -0.78
    новниш            -0.77
    Positive Logits
     (                0.47
     I                0.45
     P                0.44
                      0.43
     &                0.42
     H                0.41
     sure             0.37
     B                0.37
     …                0.37
     J                0.37
    Act Density 0.070%

    No Known Activations