INDEX
Explanations
instances of the word "on."
Head Attr Weights
0:0.07
1:0.08
2:0.07
3:0.09
4:0.09
5:0.08
6:0.07
7:0.08
8:0.09
9:0.08
10:0.07
11:0.07
Negative Logits
Sop: -2.04
counter: -1.66
Chev: -1.64
Tab: -1.61
tit: -1.61
Bernard: -1.56
Hollow: -1.56
hash: -1.56
tab: -1.55
Christy: -1.53
Positive Logits
DragonMagazine: 2.41
omaly: 2.02
inav: 1.96
eatures: 1.93
therap: 1.88
today: 1.82
Brow: 1.80
speech: 1.76
millenn: 1.75
aditional: 1.74
Activations Density 0.000%