INDEX
Explanations
future events or plans
the word "be" in various contexts
Negative Logits
sqor        -0.74
Stain       -0.66
pedia       -0.66
Moose       -0.65
vas         -0.64
partName    -0.63
oso         -0.61
Hes         -0.61
Newspaper   -0.61
Lines       -0.60
Positive Logits
replaced    1.06
able        0.96
heading     0.96
evaluated   0.95
ige         0.95
considered  0.92
judged      0.92
released    0.92
seen        0.90
phased      0.88
Activations Density: 0.072%