INDEX
    Explanations

    phrases and words that indicate temporal references or sequence of events

    invention, comparison, or specific treatments

    Negative Logits
    Token               Logit
    "featureID"         -0.88
    " Италијани"        -0.85
    "iſchen"            -0.82
    "niſſe"             -0.79
    "httphttps"         -0.78
    " ***!"             -0.77
    "iſche"             -0.77
    " Wikimedijinoj"    -0.77
    "WireFormatLite"    -0.75
    "ſſung"             -0.75
    Positive Logits
    Token               Logit
    " vœ"               0.42
    " désir"            0.33
    " adjunto"          0.32
    " derfor"           0.32
    " mož"              0.32
    " curieux"          0.31
    "vábbi"             0.31
    " therefore"        0.31
    " Investigación"    0.31
    " Espíritu"         0.31
    Activation Density: 0.129%

    No Known Activations