INDEX
    Explanations

    references to time, particularly the past

    Negative Logits
    "fav"       -0.18
    "ört"       -0.16
    "jav"       -0.15
    "iasi"      -0.15
    " Dale"     -0.14
    "ouns"      -0.14
    "annies"    -0.14
    " Lair"     -0.14
    "nerg"      -0.14
    "vore"      -0.14
    Positive Logits
    "alat"      0.17
    "eto"       0.16
    "redis"     0.16
    "exe"       0.15
    "arro"      0.15
    " ось"      0.15
    " Fach"     0.14
    "琳"        0.14
    " zbo"      0.14
    "INY"       0.14
    Activation Density: 0.203%

    No Known Activations