INDEX
Explanations
symbols and punctuation marks, particularly at the end or in referencing contexts
References in brackets
legal citations and quotations
Negative Logits
myſelf      -1.06
pleaſure    -1.04
houſe       -1.04
itſelf      -1.00
Reſ         -0.96
fubject     -0.96
ſeveral     -0.95
ſind        -0.94
greateſt    -0.93
purpoſe     -0.93
Positive Logits
↵↵          1.18
↵↵↵         0.86
↵↵↵↵        0.73
↵           0.71
The         0.65
↵↵↵↵↵↵      0.62
↵↵↵↵↵       0.61
)           0.59
or          0.58
).          0.58
Activations Density 0.569%