INDEX
Explanations
conditional statements and expressions of doubt or regret
Negative Logits
Token        Logit
itſelf       -1.02
Anſ          -0.94
pleaſure     -0.93
myſelf       -0.88
Majefty      -0.85
виправивши   -0.84
Jefus        -0.83
fhew         -0.81
enderror     -0.79
anſ          -0.79
Positive Logits
Token        Logit
so           0.54
for          0.53
COVID        0.52
I            0.50
fact         0.50
COMPONENT    0.49
COVID        0.47
na           0.47
PyErr        0.46
che          0.46
Activations Density: 0.014%