INDEX
Explanations
instances of the word "off" in various contexts
Head Attr Weights
0:0.13
1:0.01
2:0.17
3:0.03
4:0.09
5:0.05
6:0.08
7:0.10
8:0.15
9:0.02
10:0.06
11:0.05
Negative Logits
Divinity: -1.60
Sixth: -1.57
Methodist: -1.45
DM: -1.41
Norn: -1.40
Bonus: -1.40
LV: -1.37
hiro: -1.36
Baptist: -1.33
dearly: -1.31
Positive Logits
depending: 1.69
imet: 1.67
vironment: 1.65
depending: 1.64
psychiat: 1.62
othal: 1.62
usterity: 1.61
gradation: 1.54
wards: 1.52
ipolar: 1.50
Activations Density: 0.001%