INDEX
    Explanations

    information related to literature or written content

    Negative Logits
    Token             Logit
    " hatch"          -0.65
    " adventure"      -0.65
    " tyr"            -0.61
    "usable"          -0.60
    " sunny"          -0.60
    " irresistible"   -0.58
    "pir"             -0.57
    "veland"          -0.57
    " lizard"         -0.57
    "emn"             -0.56
    Positive Logits
    Token             Logit
    " nor"            1.60
    "yet"             1.25
    " Instead"        1.21
    "nor"             1.06
    " anymore"        1.03
    " Rather"         0.99
    " Nonetheless"    0.98
    " Nor"            0.97
    " Nevertheless"   0.94
    "unless"          0.93
    Activation Density: 4.915%

    No Known Activations