INDEX
    Explanations

    headings or sections that introduce new topics or principles

    NEGATIVE LOGITS
    Token           Logit
    " Gegenteil"    -0.61
    "TestBed"       -0.58
    " Hok"          -0.58
    " Compagn"      -0.57
    " Sharif"       -0.56
    " причем"       -0.56
    "derry"         -0.56
    " Channing"     -0.56
    " cib"          -0.56
    " sopp"         -0.55
    POSITIVE LOGITS
    Token           Logit
    "TypedDataSet"  0.88
    " Estelle"      0.86
    "ării"          0.82
    "ţiei"          0.82
    " Espinosa"     0.81
    "ും"            0.80
    " Neale"        0.80
    "Den"           0.78
    " LSM"          0.78
    " Einen"        0.76
    Act Density 0.041%

    No Known Activations