INDEX
Explanations
numeric identifiers or labels
Negative Logits
Token       Logit
itſelf      -1.08
myſelf      -1.06
―――――       -1.04
neſs        -0.93
་་          -0.93
wiſe        -0.91
leſs        -0.91
Diſ         -0.91
ſelf        -0.90
pleaſure    -0.89
POSITIVE LOGITS
Token       Logit
Q           2.15
Q           2.02
q           1.57
q           1.48
q           1.15
Qs          1.05
Q           1.01
Qantas      0.92
qtype       0.89
Quercus     0.88
Activations Density: 0.140%