INDEX
    Explanations

    mentions of academic subjects or concepts, particularly mathematics and science

    Negative Logits
    Token           Logit
    " Sov"          -0.79
    "ktop"          -0.70
    "lease"         -0.65
    "views"         -0.65
    "Dialogue"      -0.61
    "hold"          -0.60
    " Marcos"       -0.59
    " flesh"        -0.59
    " Border"       -0.58
    " Luxem"        -0.58

    Positive Logits
    Token           Logit
    "matical"       1.18
    "hematic"       1.12
    "hemat"         1.02
    "ilda"          1.01
    "ieu"           1.01
    "ilde"          0.98
    "ias"           0.92
    " equations"    0.90
    "ians"          0.86
    "gebra"         0.86
    Activation Density: 0.027%

    No Known Activations