Explanations
plural nouns and their associated forms in the text
Head Attr Weights
0:0.02
1:0.02
2:0.20
3:0.08
4:0.09
5:0.06
6:0.15
7:0.06
8:0.06
9:0.03
10:0.09
11:0.08
Negative Logits
loophole     -1.55
fallacy      -1.50
iod          -1.41
lightsaber   -1.39
silence      -1.38
pow          -1.38
gust         -1.36
omission     -1.35
Suzuki       -1.33
brakes       -1.33
Positive Logits
insured      1.96
excluding    1.74
inion        1.73
wealth       1.67
uliffe       1.64
interrupted  1.62
anguard      1.60
etheless     1.59
ngth         1.58
mort         1.57
Activations Density: 0.011%