    Explanations

    phrases indicating a comparison or similarity

    Negative Logits
    Token                 Logit
    " becauſe"            -0.74
    " ſtand"              -0.74
    "HtmlAttribute"       -0.73
    " cauſe"              -0.72
    " ſche"               -0.72
    "Còn"                 -0.70
    " ModelExpression"    -0.70
    " raiſ"               -0.70
    " whoſe"              -0.70
    "HandlerContext"      -0.69
    Positive Logits
    Token                 Logit
    " a"                  0.94
    " giant"              0.74
    " an"                 0.69
    " eines"              0.65
    " instanceof"         0.63
    " glorified"          0.63
    " like"               0.62
    "是个"                0.62
    " bona"               0.61
    " “"                  0.61
    Activation Density: 0.267%

    No Known Activations