INDEX
    Explanations

    words related to reasoning and justifications

    Negative Logits
    esgue           -0.91
    gatsby          -0.71
    Personensuche   -0.70
    omiast          -0.70
     Cæsar          -0.69
    FundMe          -0.69
    nachron         -0.67
    iconductor      -0.66
    gameserver      -0.65
     lavable        -0.64
    Positive Logits
     we             1.20
     you            1.12
     they           1.06
     it             0.99
     someone        0.94
     that           0.84
     he             0.82
     everyone       0.82
     anyone         0.81
     people         0.77
    Act Density 0.051%

    No Known Activations