INDEX
    Explanations

    references to specific time periods, such as centuries and years

    references to time periods and societal contexts

    Negative Logits
    Token              Logit
    "inki"             -0.72
    "cause"            -0.70
    "kefeller"         -0.63
    "ãĥĦ"              -0.62
    "jong"             -0.62
    "Dro"              -0.62
    "soType"           -0.61
    "appropriately"    -0.60
    "him"              -0.59
    "cheat"            -0.59

    Positive Logits
    Token              Logit
    " there"           1.00
    ","                0.91
    "adays"            0.83
    " we"              0.81
    " it"              0.80
    " however"         0.80
    " these"           0.77
    " tens"            0.76
    " nobody"          0.75
    " they"            0.73

    Activation Density: 0.281%

    No Known Activations