INDEX
Explanations
phrases indicating possession or relationships
Negative Logits
Token        Logit
Majefty      -1.51
itſelf       -1.44
himſelf      -1.42
myſelf       -1.41
fubject      -1.25
Monfieur     -1.25
themſelves   -1.22
Efq          -1.22
Houſe        -1.21
neceff       -1.20
Positive Logits
Token        Logit
a            0.95
the          0.91
time         0.83
“            0.81
             0.81
Time         0.70
The          0.70
an           0.69
difficult    0.68
M            0.67
Activations Density 0.216%
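
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature activates at all. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 1,000 token positions: 998 zeros, 2 nonzero.
sample = [0.0] * 998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.216% would mean the feature fires on roughly 2 of every 1,000 sampled tokens.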