INDEX
Explanations
the word "of" used in various contexts
Head Attr Weights
0:0.03
1:0.01
2:0.16
3:0.05
4:0.12
5:0.03
6:0.13
7:0.10
8:0.06
9:0.04
10:0.07
11:0.15
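
As a reading aid, the head attribution weights listed above can be ranked to see which heads dominate. The following is a minimal sketch, not part of the original dashboard: the values are copied from the listing, `top_heads` is a hypothetical helper name, and treating the weights as directly comparable per-head contributions is an assumption.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.01, 2: 0.16, 3: 0.05, 4: 0.12, 5: 0.03,
    6: 0.13, 7: 0.10, 8: 0.06, 9: 0.04, 10: 0.07, 11: 0.15,
}


def top_heads(weights, k=3):
    """Return the k head indices with the largest weights, largest first.

    Hypothetical helper for illustration only.
    """
    return sorted(weights, key=weights.get, reverse=True)[:k]


total = sum(head_weights.values())
top3 = top_heads(head_weights)
print(top3, round(total, 2))  # [2, 11, 6] 0.95
```

Heads 2, 11, and 6 carry the largest weights. The listed values sum to 0.95 rather than 1, which may simply reflect rounding to two decimals.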
Negative Logits
iri             -1.48
zan             -1.43
Pilgrim         -1.37
Refuge          -1.33
Tale            -1.27
Wolves          -1.26
Pandora         -1.25
ÃÂ              -1.25
ukemia          -1.24
Interstitial    -1.24
Positive Logits
deleting        1.45
declass         1.45
typed           1.43
sparing         1.39
scarce          1.39
grep            1.35
afety           1.34
hap             1.32
emoji           1.31
abundantly      1.30
Activations Density 0.002%