    Negative Logits
    " wartet"        -0.10
    " anesthesia"    -0.09
    "יסוי"           -0.09
    "=date"          -0.09
    " anest"         -0.09
    " stilte"        -0.08
                     -0.08
    " tasa"          -0.08
    "患者"           -0.08
    " Therapie"      -0.08
    Positive Logits
    " villain"       0.09
    " rebel"         0.09
    " defeated"      0.09
    " villains"      0.09
    " guer"          0.08
    " кораб"         0.08
    " guerre"        0.08
    " faction"       0.08
    " debris"        0.08
    " factions"      0.08
    Activation Density: 0.009%

    No Known Activations