INDEX
Explanations
phrases describing a quantity or a specific group of items
phrases that refer to categories, instances, or classes of items or experiences
Negative Logits
abus        -0.80
osate       -0.77
Spoiler     -0.74
uberty      -0.71
endon       -0.71
slaught     -0.70
Canaver     -0.69
obook       -0.69
rium        -0.69
isconsin    -0.69
Positive Logits
things      1.05
items       1.00
types       0.99
kinds       0.99
pesky       0.96
occasions   0.95
entities    0.95
cases       0.95
situations  0.94
categories  0.94
Activations Density: 0.109%