INDEX
Explanations
phrases describing relationships or interactions
prepositions indicating relationships or positions
Negative Logits
Token     Logit
arch      -0.64
OV        -0.63
zona      -0.54
TBA       -0.54
addafi    -0.54
antly     -0.53
hare      -0.52
hoff      -0.52
apolis    -0.52
bucks     -0.52
Positive Logits
Token     Logit
which     2.50
which     2.03
Which     1.79
whom      1.65
Which     1.62
whose     1.46
whence    1.44
wherein   1.31
whose     1.22
whereby   1.12
Activations Density: 0.909%