    Explanations

    occurrences of specific character patterns or symbols

    Negative Logits
    Token           Logit
    " manif"        -0.87
    " disadvant"    -0.87
    " misunder"     -0.86
    " levers"       -0.83
    " Vaugh"        -0.80
    " vulner"       -0.78
    "geries"        -0.76
    " promoters"    -0.75
    " incorpor"     -0.74
    " fronts"       -0.74
    Positive Logits
    Token                         Logit
    "ï¸ı"                         1.38
    " âĢº"                        0.92
    "cffffcc"                     0.90
    "âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ"    0.86
    "âĸ"                          0.85
    "ש"                           0.83
    "âĸ¬âĸ¬"                      0.83
    "HUD"                         0.82
    "×"                           0.80
    "STAR"                        0.80
    Act Density 0.106%

    No Known Activations
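
Several of the positive-logit tokens (for example "âĢº" and "âĸ¬âĸ¬") look like the printable stand-ins that GPT-2-style byte-level BPE vocabularies use for raw bytes, not ordinary text. The sketch below assumes that encoding, which this page does not confirm. It rebuilds the GPT-2 byte-to-character table and reverses it to recover the underlying characters. The name `decode_token` is a hypothetical helper, not part of any dashboard or library API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 byte -> printable-character table (assumed encoding)."""
    # Printable Latin-1 bytes map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level token string back into text."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through.
            raw.extend(ch.encode("utf-8"))
    # Tokens that hold only part of a multi-byte character cannot be fully
    # decoded; errors="replace" renders the incomplete bytes as U+FFFD.
    return bytes(raw).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĢº", "âĶĢ", "âĸ¬âĸ¬", "ï¸ı", "âĸ", "HUD"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, "âĢº" decodes to "›", "âĶĢ" to the box-drawing character "─", "âĸ¬âĸ¬" to "▬▬", and "ï¸ı" to the invisible variation selector U+FE0F. "âĸ" holds only the first two bytes of a three-byte character, so it has no standalone rendering. This reading would fit the explanation about character patterns and symbols.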