Explanations
phrases indicating connections or relationships, especially involving possession or descriptions
Head Attr Weights
0:0.02
1:0.02
2:0.07
3:0.05
4:0.07
5:0.02
6:0.25
7:0.28
8:0.03
9:0.03
10:0.07
11:0.05
Negative Logits
izons       -1.36
gdala       -1.30
erion       -1.29
orthy       -1.24
fml         -1.23
Opportun    -1.22
Kad         -1.19
dearly      -1.18
nowhere     -1.18
fman        -1.17
Positive Logits
anchester         1.59
unda              1.44
iform             1.43
flowers           1.37
ridges            1.34
emetery           1.29
aign              1.29
Flowers           1.24
avior             1.24
demonstrations    1.24
Activations Density 0.001%