    Negative Logits
    Token               Logit
    " Buddhism"         -0.07
    " photographers"    -0.06
    " пред"             -0.06
    " жит"              -0.06
    (no token text)     -0.06
    " devices"          -0.06
    " partic"           -0.06
    (no token text)     -0.06
    "screens"           -0.06
    "charger"           -0.06
    Positive Logits
    Token               Logit
    " monopoly"         0.14
    " monopol"          0.11
    "opoly"             0.08
    " unanimous"        0.07
    " Только"           0.07
    "์,"                 0.07
    " typename"         0.07
    " Lever"            0.07
    " Те"               0.06
    " Donovan"          0.06
    Activation Density: 0.002%

    No Known Activations