INDEX
    Explanations

    Words or phrases that indicate some kind of knowledge, such as "know" or "believe", or that hint at a plan.

    Negative Logits
     think          -1.49
     know           -1.44
     believe        -1.32
     want           -1.23
     hope           -1.18
     say            -1.18
     see            -1.16
     wish           -1.15
     feel           -1.13
     appreciate     -1.13
    Positive Logits
    OGND            0.60
    enerbah         0.56
                    0.53
    Cubit           0.53
     terem          0.53
     serem          0.51
    MongoClient     0.50
    windowFixed     0.50
     Смо            0.49
    Save            0.49
    Act Density 4.622%

    No Known Activations