INDEX
Explanations
instances of the phrase "sign up."
Head Attr Weights
0:0.05
1:0.17
2:0.06
3:0.09
4:0.03
5:0.20
6:0.08
7:0.02
8:0.05
9:0.06
10:0.07
11:0.07
Negative Logits
sandwiches     -1.69
agara          -1.63
pees           -1.56
differentiate  -1.36
magnification  -1.35
gars           -1.35
iPads          -1.34
substant       -1.34
imped          -1.32
monopoly       -1.32
Positive Logits
rall     1.80
Ble      1.65
rive     1.58
shenan   1.56
Heart    1.49
enthusi  1.47
Guest    1.47
ertodd   1.46
Cry      1.43
Cra      1.42
Activations Density 0.001%