INDEX
    Explanations

    instances of the words "that" and "this."

    Negative Logits
    InjectAttribute   -0.83
     Theſe            -0.78
     noDo             -0.78
    AddTagHelper      -0.77
    )"),              -0.75
     ProtoMessage     -0.72
    providedIn        -0.71
    viewDid           -0.71
     ?',              -0.70
    səhifə            -0.69
    Positive Logits
    ,                 0.61
     Walkover         0.57
    נטרנט             0.52
     can              0.50
    .                 0.50
    weisung           0.49
    sika              0.47
     пути             0.47
    Bericht           0.47
     tatu             0.46
    Activation Density: 0.109%

    No Known Activations