INDEX
Explanations
phrases and quantifiers related to time and frequency
Head Attr Weights
0:0.02
1:0.02
2:0.12
3:0.05
4:0.37
5:0.02
6:0.08
7:0.05
8:0.04
9:0.03
10:0.08
11:0.07
Negative Logits
represented: -1.67
existence: -1.61
iage: -1.48
edom: -1.47
rollment: -1.43
oka: -1.43
rious: -1.40
orer: -1.35
oppable: -1.32
metadata: -1.28
Positive Logits
horm: 1.99
wisely: 1.55
wink: 1.54
Laughs: 1.50
disguise: 1.49
��: 1.46
istg: 1.43
anyway: 1.43
accordingly: 1.42
beforehand: 1.41
Activations Density 0.160%