INDEX
Explanations
phrases related to professional practices or activities
instances of time, duration, and references to practice or performance situations
Negative Logits
Token       Logit
kefeller    -0.78
arnaev      -0.71
Detailed    -0.68
osponsors   -0.67
¥ŀ          -0.64
groupon     -0.60
ongyang     -0.60
ãĤ´ãĥ³      -0.58
SourceFile  -0.58
ogly        -0.57
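
Some tokens in this list (`¥ŀ`, `ãĤ´ãĥ³`) look garbled but are probably not extraction damage. They resemble the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below assumes the feature comes from a model with a GPT-2-style byte-level tokenizer, which this page does not confirm. It rebuilds that byte-to-character table and maps a vocabulary string back to text. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    # Bytes that already have a printable character keep it.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every other byte is assigned the next unused code point from 256 upward.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: map a byte-level vocabulary string back to text.

    Tokens that hold only part of a multi-byte character cannot be decoded
    alone, so invalid bytes are shown as U+FFFD instead of raising an error.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ´ãĥ³"))    # ゴン
print(decode_token("kefeller"))  # kefeller (plain ASCII is unchanged)
```

Under this assumption `ãĤ´ãĥ³` is the katakana pair ゴン, while `¥ŀ` is a fragment of a longer byte sequence and has no standalone reading.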
Positive Logits
Token       Logit
I           1.30
we          1.19
they        1.03
nobody      1.00
you         0.99
yeah        0.96
though      0.95
everybody   0.93
it          0.93
he          0.93
Activations Density 0.255%
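
For readers unfamiliar with the density figure, here is a minimal sketch of one common reading: the fraction of sampled tokens on which the feature's activation is above zero. This definition is an assumption, since the page does not say how the number was computed. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition; the dashboard does not state its formula.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the feature fires on 2 of 1000 token positions.
sample = [0.0] * 1000
sample[10] = 2.1
sample[500] = 0.4

print(f"{activation_density(sample):.3%}")  # 0.200%
```

Under this reading, a density of 0.255% means the feature fires on roughly 2 to 3 tokens out of every 1000.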