    Explanations

    numerical data and group classifications

    Negative Logits
    " Garc"    -0.72
    "ped"    -0.71
    "ãĥīãĥ©"    -0.70
    "writers"    -0.69
    "Cro"    -0.69
    "respons"    -0.68
    "PO"    -0.64
    "rote"    -0.64
    " Huss"    -0.62
    " Zo"    -0.61
    Positive Logits
    "Ĵ"    0.69
    "âĶĢâĶĢâĶĢâĶĢ"    0.69
    " âĹı"    0.69
    "==="    0.69
    "®"    0.65
    "ãĢij"    0.65
    "çͰ"    0.64
    "Ĩ"    0.61
    "²"    0.60
    "ãĤ¤ãĥĪ"    0.60
    Activation Density: 0.039%

    No Known Activations