INDEX
Explanations
phrases and words regarding origins and historical contexts
Head Attr Weights
0:0.01
1:0.01
2:0.09
3:0.05
4:0.13
5:0.02
6:0.06
7:0.36
8:0.02
9:0.02
10:0.09
11:0.09
Negative Logits
phies     -1.90
aptic     -1.65
iatric    -1.62
luent     -1.58
erning    -1.53
iatrics   -1.50
ooth      -1.49
urances   -1.49
edient    -1.43
yss       -1.42
Positive Logits
genesis    1.72
origins    1.47
myth       1.43
rumor      1.39
causation  1.37
myths      1.36
fragment   1.33
miscon     1.32
diver      1.30
strand     1.29
Activations Density 0.009%