INDEX
    Explanations

    statements that are presented as facts

    references to factual statements or claims

    Negative Logits
    avorite         -0.79
     Klux           -0.79
    jri             -0.76
    isoft           -0.71
    artney          -0.69
    ModLoader       -0.69
    interstitial    -0.69
    ctic            -0.68
    hod             -0.68
     Carbuncle      -0.67
    Positive Logits
    ually           1.26
    orial           1.15
    itious          1.05
    ional           1.03
    oids            0.99
    ially           0.94
    uality          0.92
    ual             0.91
    oid             0.88
    icity           0.86
    Act Density 0.029%

    No Known Activations