INDEX
Explanations
words related to actions or processes indicated by the suffix "-ings."
words or concepts related to vague or abstract classifications
Negative Logits
recogn      -0.69
URRENT      -0.67
dis         -0.65
saline      -0.64
SIGN        -0.63
thin        -0.62
UTE         -0.62
isol        -0.61
sport       -0.61
renewable   -0.60
Positive Logits
hots        1.21
ings        1.20
manship     1.08
omething    1.03
poons       1.01
quartered   1.00
hower       0.99
peed        0.96
poon        0.95
furt        0.92
Activations Density 0.014%