INDEX
    Explanations
    NEGATIVE LOGITS
    Token           Logit
    "Lemma"         -0.08
    "-thumbnail"    -0.08
    " stories"      -0.08
    " liters"       -0.08
    " medals"       -0.07
    " walk"         -0.07
    "भाव"           -0.07
    " deth"         -0.07
    "Gy"            -0.07
    " racont"       -0.07
    POSITIVE LOGITS
    Token           Logit
    " syntax"       0.20
    " Syntax"       0.19
    "syntax"        0.18
    "Syntax"        0.17
    "_Syntax"       0.16
    ".syntax"       0.14
    "yntax"         0.13
    " notation"     0.13
    " sint"         0.12
    " synt"         0.12
    Activation Density: 0.030%

    No Known Activations