INDEX
Explanations
terms related to spatial or infrastructural changes and improvements
Negative Logits
T    -1.06
S    -1.05
J    -1.03
E    -1.01
E    -0.99
K    -0.98
P    -0.96
O    -0.95
G    -0.95
H    -0.94
Positive Logits
myſelf        1.98
themſelves    1.93
itſelf        1.89
Theſe         1.81
himſelf       1.78
Anſ           1.75
ſelves        1.75
ſever         1.75
pleaſure      1.75
Diſ           1.71
Activations Density: 0.646%