INDEX
    NEGATIVE LOGITS
    Token            Logit
    ' inherit'       -0.68
    ' Genie'         -0.60
    'avorite'        -0.59
    ' dearly'        -0.59
    ' accurately'    -0.59
    ' distortions'   -0.57
    'Reviewer'       -0.56
    ' truly'         -0.56
    ' gorilla'       -0.56
    '/"'             -0.54
    POSITIVE LOGITS
    Token            Logit
    'imester'        0.98
    'morning'        0.91
    ' afternoon'     0.83
    'cture'          0.82
    'ciating'        0.80
    ' evening'       0.79
    'season'         0.74
    ' gestation'     0.74
    'eteenth'        0.73
    'cheon'          0.70
    Activation Density: 0.773%

    No Known Activations