INDEX
Explanations
phrases indicating a strong opinion or belief about a particular topic
instances of the word "the" in various contexts
Negative Logits
龍喚士      -0.76
oros        -0.76
acers       -0.74
enjoys      -0.74
these       -0.72
姫          -0.70
alties      -0.70
66666666    -0.68
essors      -0.68
mares       -0.67
Positive Logits
culmination  1.04
first        1.03
kind         0.99
same         0.97
easiest      0.96
type         0.96
earliest     0.94
second       0.92
toughest     0.92
moment       0.91
Activations Density 0.080%