INDEX
    Explanations

    references to comparisons and alignments in data or belief systems

    Negative Logits
    ".respond"      -0.16
    "emento"        -0.15
    "łu"            -0.14
    "ега"           -0.14
    "นาด"           -0.14
    " Ortiz"        -0.14
    "OutOfRange"    -0.13
    " Zig"          -0.13
    " kalp"         -0.13
    "unte"          -0.13
    Positive Logits
    " match"        0.59
    " matches"      0.56
    "-match"        0.51
    "match"         0.49
    " Match"        0.47
    " MATCH"        0.47
    "Match"         0.45
    " Matches"      0.44
    " matched"      0.43
    " matching"     0.43
    Act Density 0.192%

    No Known Activations