INDEX
Explanations
references to individuals and their relationships
Negative Logits
cauſe      -1.09
raiſ       -1.05
purpoſe    -1.03
houſe      -1.01
fevere     -1.00
uſed       -1.00
Racine     -0.99
pleaſure   -0.98
caufe      -0.95
Houſe      -0.95
Positive Logits
them       1.20
him        1.06
Them       0.98
THEM       0.96
Him        0.95
Them       0.93
HIM        0.92
us         0.92
Him        0.91
м          0.82
Activations Density 0.127%