Explanations
Mentions of the word "simple" and its variants, in the context of concepts and processes
Negative Logits
Token        Logit
simply       -0.22
simplement   -0.21
Simply       -0.20
simple       -0.19
Simply       -0.19
Simple       -0.18
_simple      -0.18
einfach      -0.18
simpl        -0.18
simplified   -0.17
Positive Logits
Token        Logit
ton          0.43
tons         0.43
xes          0.35
-minded      0.33
TON          0.30
minded       0.29
ctic         0.27
yet          0.26
/basic       0.24
/plain       0.23
Activations Density: 0.040%