INDEX
Explanations
references to intellectual property and patent-related terminology
Negative Logits
-      -0.52
'      -0.49
Pure   -0.47
sist   -0.46
I      -0.46
...    -0.46
S      -0.45
.-     -0.45
...    -0.45
{-     -0.45
Positive Logits
purpoſe    0.93
Jefus      0.90
ſche       0.89
ſeveral    0.88
Theſe      0.88
pleaſure   0.88
itſelf     0.88
becauſe    0.86
ſind       0.86
ſtate      0.85
Activations Density 0.447%