INDEX
    Explanations

    references to specific historical figures or texts

    Negative Logits
    Token            Logit
    "ftime"          -0.20
    "abinet"         -0.17
    " veter"         -0.14
    " cuckold"       -0.14
    "ÙĦÙ쨩"          -0.14
    " جÙĨ"          -0.14
    "ocard"          -0.14
    "son"            -0.14
    "apsulation"     -0.14
    "cast"           -0.14
    Positive Logits
    Token            Logit
    " Anne"          0.38
    "Anne"           0.31
    " Frank"         0.30
    " anne"          0.26
    "Frank"          0.25
    " diary"         0.24
    " Otto"          0.24
    " Amsterdam"     0.24
    " Holland"       0.23
    " Dutch"         0.23
    Activation Density: 0.001%

    No Known Activations