INDEX
Explanations
phrases introducing lists or sets of items or actions
phrases that introduce lists or examples
Negative Logits
terness    -0.80
irlf       -0.79
osate      -0.78
tick       -0.73
slaught    -0.72
Canaver    -0.72
opus       -0.72
gow        -0.72
Fed        -0.71
fecture    -0.71
Positive Logits
include    1.23
are        1.12
latter     1.11
items      1.11
devices    1.08
types      1.07
entities   1.06
kinds      1.05
relate     1.03
aren       1.03
Activations Density: 0.120%