    Explanations

    expressions of surprise or additional points in a conversational context

    Negative Logits
    Token            Logit
    " intptr"        -0.39
    "ニョ"            -0.38
                     -0.36
    " win"           -0.35
    " bross"         -0.35
    "kehren"         -0.34
    "autaire"        -0.34
    " Industrie"     -0.34
    "mph"            -0.33
    " Arm"           -0.32
    Positive Logits
    Token              Logit
    "SharedDtor"       0.64
    " betweenstory"    0.54
    "そうそう"          0.54
    " שוליים"          0.54
    "InjectAttribute"  0.53
    "PerformLayout"    0.52
    "そういえば"        0.52
    "SharedCtor"       0.52
    "OGND"             0.50
    "Cyfarwyddwr"      0.47
    Activation Density: 0.020%

    No Known Activations