INDEX
Explanations
terms related to reality
Negative Logits
xual         -0.83
theless      -0.73
Vaugh        -0.65
ucky         -0.65
Carbuncle    -0.61
indal        -0.60
served       -0.60
anted        -0.59
VC           -0.58
clipboard    -0.58
Positive Logits
ignment      1.32
polit        1.20
estate       1.17
isation      1.11
estate       1.11
igning       1.01
izations     0.98
igned        0.96
izable       0.93
DonaldTrump  0.90
Activations Density 3.420%