INDEX
    Explanations

    German instructions and phrasing

    Negative Logits
    " Duits"          0.62
    "DAS"             0.60
    " DAS"            0.56
    " Dtsch"          0.55
    "Das"             0.54
    " német"          0.54
    " daß"            0.52
    " šport"          0.52
    "よそ"            0.52
    "äsident"         0.52

    Positive Logits
    " onboard"        0.46
    " методом"        0.43
    " iteratively"    0.42
    " iterative"      0.42
    " bere"           0.40
    " Beans"          0.40
    " Contains"       0.39
    " Bel"            0.39
    " adapt"          0.39
    " bel"            0.39
    Act Density 0.040%

    No Known Activations