Explanations
terms indicating negation or non-existence
Negative Logits
Token        Logit
Tycoon       -0.98
Franks       -0.76
ãĤ¼ãĤ¦ãĤ¹    -0.68
Grind        -0.65
hordes       -0.65
Rooms        -0.64
Dug          -0.64
Halls        -0.64
Kings        -0.64
Spoon        -0.62
Positive Logits
Token        Logit
chal         1.20
stop         1.10
linear       1.07
etheless     1.07
verbal       1.05
epad         1.04
fiction      1.02
threatening  1.02
profit       1.01
zero         1.00
Activations Density 0.018%