INDEX
    Explanations

    references to specific items or subjects mentioned in the text

    Negative Logits
    Logit   Token
    -0.53   " Ceramby"
    -0.53   " more"
    -0.50   "mazoo"
    -0.50   "どころ"
    -0.50   " outside"
    -0.49   "espan"
    -0.47   " consider"
    -0.47   "voll"
    -0.46   "しか"
    -0.45   "dziew"
    Positive Logits
    Logit   Token
    1.44    " this"
    1.43    "this"
    1.39    "هذه"
    1.37    "THIS"
    1.33    " dieses"
    1.32    " THIS"
    1.31    "these"
    1.31    " este"
    1.30    [token missing]
    1.28    "This"
    Activation Density: 0.168%

    No Known Activations