INDEX
Explanations
words related to legislation, history, and current events
Negative Logits
esta       -0.62
prus       -0.61
paradise   -0.60
istor      -0.59
agra       -0.58
rame       -0.56
ode        -0.55
stew       -0.55
itta       -0.54
dormant    -0.54
Positive Logits
whatsoever           0.83
_.                   0.82
imaginable           0.75
thereafter           0.72
intervals            0.72
..................   0.72
enance               0.71
srfAttach            0.71
ILCS                 0.70
lust                 0.68
Activations Density 0.494%