    NEGATIVE LOGITS
    Token               Logit
    "Assume"            -0.46
    " Assume"           -0.44
    "Stephan"           -0.41
    " origine"          -0.38
    " inactivation"     -0.37
    "Hm"                -0.37
    " disadvantages"    -0.36
    " suppos"           -0.36
    " Osborn"           -0.36
    "Handle"            -0.36
    POSITIVE LOGITS
    Token               Logit
    " poetry"           2.20
    " Poetry"           2.14
    "poetry"            1.98
    "Poetry"            1.98
    " POETRY"           1.77
    " poesía"           1.52
    " poesia"           1.44
    " poésie"           1.37
    " poets"            1.30
    " poet"             1.23
    Activation Density: 0.003%

    No Known Activations