INDEX
    Explanations

    phrases indicating a list of multiple items or actions

    references to general concepts or items within a text

    Negative Logits
    Token      Logit
    "Gaza"     -0.63
    "ardo"     -0.63
    "NAS"      -0.62
    "oku"      -0.61
    "bern"     -0.60
    "inav"     -0.60
    "irl"      -0.60
    "NES"      -0.60
    " Coul"    -0.60
    "CVE"      -0.59
    Positive Logits
    Token            Logit
    " happened"      0.95
    " happens"       0.94
    " happening"     0.93
    " happ"          0.86
    " transpired"    0.86
    " happen"        0.85
    " pertaining"    0.79
    " occurring"     0.78
    " imaginable"    0.75
    "worldly"        0.73
    Activation Density: 0.033%

    No Known Activations