INDEX
    Explanations

    phrases indicating structural or functional components in technical descriptions

    Negative Logits
    Token                  Logit
    " myſelf"              -0.91
    " becauſe"             -0.75
    "TagMode"              -0.72
    " whoſe"               -0.71
    " themſelves"          -0.69
    "ſelf"                 -0.68
    " himſelf"             -0.68
    "IntoConstraints"      -0.67
    "DriverManager"        -0.66
    " Jefus"               -0.65
    Positive Logits
    Token                  Logit
    " features"            0.69
    " contain"             0.66
    "contains"             0.63
    " contains"            0.61
    " include"             0.59
    "include"              0.57
    "包含"                 0.56
    " bevat"               0.56
    "features"             0.56
    " Contains"            0.55
    Activation Density: 0.845%

    No Known Activations