    Explanations

    expressions that emphasize key observations or opinions about various topics

    Negative Logits
    Token         Logit
    "strup"       -0.20
    "або"         -0.15
    "nish"        -0.15
    "ä¹"          -0.15
    "ook"         -0.15
    "ίÏĦ"         -0.14
    "905"         -0.14
    "uch"         -0.13
    "PFN"         -0.13
    " âĢı"        -0.13
    Positive Logits
    Token         Logit
    " thing"      0.37
    " Thing"      0.30
    "Thing"       0.30
    "thing"       0.27
    "THING"       0.21
    "ãģĵãģ¨ãģ«"   0.20
    " part"       0.19
    "(thing"      0.18
    " fact"       0.17
    " totiž"      0.16
    Activation Density: 0.108%

    No Known Activations