    Explanations

    the infinitive form of verbs, particularly "to" followed by another verb

    Negative Logits
    anca        -0.17
    iliz        -0.16
     fewer      -0.15
    @student    -0.15
    uc          -0.15
    abei        -0.14
    ìķħ         -0.14
     chances    -0.14
    antly       -0.14
    patches     -0.14

    Positive Logits
    sap         0.16
    s           0.14
    oping       0.14
    Reuse       0.14
    MODE        0.14
     Rough      0.14
     best       0.13
    ÃŃž         0.13
    ops         0.13
    eyh         0.13

    Activation Density: 0.039%

    No Known Activations