INDEX
Explanations
sentences that involve confirmation or validation statements
Negative Logits
unread       -0.15
ieten        -0.14
itele        -0.14
Responder    -0.14
anske        -0.14
Sesso        -0.14
================================================  -0.13
Resume       -0.13
IFn          -0.13
pseudo       -0.13
Positive Logits
sources      0.16
during       0.15
verted       0.15
During       0.15
201          0.14
however      0.14
isch         0.14
:^           0.14
On           0.14
initial      0.14
Activations Density 0.083%