INDEX
    Explanations

    statements related to the qualities or properties of subjects

    Negative Logits
    " Kar"     -0.16
    "ril"      -0.16
    " plans"   -0.14
    "Kar"      -0.14
    " Jet"     -0.14
    "avar"     -0.14
    "illery"   -0.14
    "atoms"    -0.14
    "notated"  -0.14
    "ô"        -0.13
    Positive Logits
    "itra"     0.17
    "elter"    0.15
    "iesel"    0.15
    "θή"       0.15
    "eration"  0.14
    "-tm"      0.14
    "TRGL"     0.14
    "utow"     0.14
    "erator"   0.14
    "strate"   0.14
    Act Density 0.079%

    No Known Activations