INDEX
    Explanations

    instances of the word "match" or variations thereof

    NEGATIVE LOGITS
    "quirrel"       -0.81
    "zzleHttp"      -0.76
    "бенок"         -0.74
    "thâu"          -0.72
    "skyl"          -0.70
    " kasarigan"    -0.70
    "::-"           -0.70
    "hehe"          -0.69
    "Vader"         -0.69
    -0.69
    POSITIVE LOGITS
    " match"      2.63
    " Match"      2.58
    " MATCH"      2.58
    "Match"       2.50
    "match"       2.50
    " matches"    2.44
    "MATCH"       2.32
    " Matches"    2.29
    "matches"     2.06
    "Matches"     2.02
    Activation Density 0.050%

    No Known Activations