    Explanations

    numerical sequences in square brackets

    numerical data or statistics

    Negative Logits
     seiz         -0.68
     laure        -0.68
     agre         -0.63
    milo          -0.63
     pen          -0.61
     este         -0.60
     wagon        -0.58
     sle          -0.58
     mustache     -0.58
     bung         -0.58
    Positive Logits
    ][            1.30
    ].            1.19
    ]).           1.19
    ],[           1.14
    ]),           1.10
    ]             1.06
    ],            1.05
    ]:            1.03
    ]);           0.96
    ]"            0.94
    Act Density 0.029%

    No Known Activations