    Explanations

    proper nouns with initials or abbreviations

    capital letters, particularly of names or titles

    NEGATIVE LOGITS
    ModLoader    -0.81
    dylib        -0.75
    Topics       -0.71
    FACE         -0.68
    culosis      -0.68
    Relations    -0.67
    20439        -0.67
    Contents     -0.66
    duino        -0.65
    âĶĢâĶĢ       -0.64
    POSITIVE LOGITS
    itte         0.78
    ahn          0.74
    iner         0.73
    uty          0.70
    ze           0.70
    oust         0.69
    oner         0.69
    utsch        0.67
    erman        0.67
    ham          0.66
    Activation Density: 0.257%

    No Known Activations