INDEX
    Explanations

    terms related to matching and comparison in various contexts

    Negative Logits
    Token           Logit
    "quirrel"       -0.91
    "zzleHttp"      -0.87
    " kasarigan"    -0.87
    "🏼"             -0.77
    " généraux"     -0.77
    "бенок"         -0.77
    "skyl"          -0.76
    "thâu"          -0.76
    "Vader"         -0.76
    "hehehe"        -0.76
    Positive Logits
    Token           Logit
    " MATCH"        2.20
    " match"        2.15
    " Match"        2.13
    "Match"         2.06
    " matches"      2.06
    "match"         2.03
    "MATCH"         2.02
    " Matches"      1.92
    "matches"       1.77
    " matched"      1.71
    Activation Density: 0.057%

    No Known Activations