INDEX
    NEGATIVE LOGITS
     scouts         -0.07
     teachers       -0.07
    üst             -0.06
                    -0.06
    Mark            -0.06
    вание           -0.06
     definite       -0.06
     Wenn           -0.06
    lte             -0.06
     sofa           -0.06
    POSITIVE LOGITS
    \E              0.07
     نیروی          0.07
     अब             0.06
     Industrial     0.06
     Composition    0.06
    _fc             0.06
     composition    0.06
     conclude       0.06
    <?>             0.06
    /command        0.06
    Activation Density: 0.019%

    No Known Activations