INDEX
    Explanations

    patterns of characters that do not form coherent words or phrases

    references to a specific symbol or character

    Negative Logits
    "raints"       -0.97
    "matic"        -0.80
    " Instr"       -0.75
    " slic"        -0.74
    " Appalach"    -0.72
    "utra"         -0.71
    "urated"       -0.68
    "ngth"         -0.67
    " Kodi"        -0.67
    " primates"    -0.67
    Positive Logits
    "âĶĢâĶĢ"                          1.22
    "âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ"        0.93
    "ľ"                               0.92
    "Ĺ"                               0.90
    "à©"                              0.90
    "ishable"                         0.89
    "Ķ"                               0.88
    "Ł"                               0.87
    "ĺ"                               0.85
    "ת"                               0.85
    Act Density 0.042%

    No Known Activations