Explanations
references to specific decades and historical periods
Negative Logits
            -0.74
a           -0.72
2           -0.66
,           -0.63
I           -0.61
↵↵          -0.61
:           -0.61
"           -0.60
9           -0.58
↵           -0.57
Positive Logits
Efq         1.40
Houſe       1.22
houſe       1.16
Monfieur    1.16
faſt        1.12
itſelf      1.09
―――――       1.08
^(@)        1.08
pleaſure    1.05
$_(         1.05
Activations Density 0.022%