INDEX
    Explanations

    evaluate capabilities or test performance

    Negative Logits
    four            0.59
    medical         0.58
    nine            0.54
    building        0.53
    five            0.52
    smoothing       0.52
    pandemic        0.52
    no              0.50
    muscle          0.50
    housing         0.49
    Positive Logits
    (no token text) 0.51
    のために        0.48
     Eesti          0.46
     Spieler        0.46
    (no token text) 0.46
     HomePage       0.45
     Ελλάδα         0.45
     überras        0.45
    (no token text) 0.45
     *)&            0.44
    Act Density 0.000%

    No Known Activations