INDEX
    Explanations
    Negative Logits
    " aerial"           -0.08
    " causal"           -0.07
    " UIApplication"    -0.06
    " CWE"              -0.06
    " Khal"             -0.06
    "eder"              -0.06
    " nepří"            -0.06
    " Predicate"        -0.06
    "BED"               -0.06
    "andler"            -0.06
    Positive Logits
    " Fort"              0.12
    "Fort"               0.11
    "fort"               0.09
    " fort"              0.09
    " Forty"             0.08
    "furt"               0.07
    " fortress"          0.07
                         0.07
    " font"              0.07
    " forty"             0.07
    Activation Density: 0.006%

    No Known Activations