INDEX
Explanations
verbs that indicate action or activity
statements that describe the existence or presence of various concepts and elements
Negative Logits
obook           -0.74
ocracy          -0.70
esis            -0.68
unfocusedRange  -0.63
ipop            -0.63
aroo            -0.62
ographer        -0.62
phas            -0.62
phabet          -0.59
ropolis         -0.58
Positive Logits
respectively    1.56
alike           1.26
jointly         1.17
mutually        1.13
examples        1.02
trademarks      1.00
intertwined     0.99
insepar         0.99
fronts          0.96
both            0.94
Activations Density 0.294%