    Explanations

    instances where comparisons are made between different subjects or methods

    Negative Logits
    Token           Logit
    " Rosenberg"    -0.71
    "Bism"          -0.64
    "lidene"        -0.63
    "entgen"        -0.63
    "jectories"     -0.63
    " Man"          -0.59
    "ubourg"        -0.59
    "tserrat"       -0.58
    " irgende"      -0.58
    " episódios"    -0.57

    Positive Logits
    Token           Logit
    " comparison"   2.27
    " comparisons"  2.21
    " Comparisons"  2.12
    " Comparison"   2.03
    " comparing"    2.02
    "Comparison"    1.90
    " compares"     1.89
    " Compare"      1.87
    " Comparing"    1.87
    "comparison"    1.86

    Activation Density: 0.119%

    No Known Activations