INDEX
Explanations
instances of the word "this."
Head Attr Weights
0:0.01
1:0.01
2:0.11
3:0.06
4:0.11
5:0.02
6:0.17
7:0.31
8:0.03
9:0.04
10:0.04
11:0.03
Negative Logits
imity     -1.66
�         -1.57
paio      -1.54
sonian    -1.54
iatric    -1.54
edom      -1.52
reach     -1.50
urdue     -1.44
showc     -1.43
ctors     -1.42
Positive Logits
00200000    1.62
PsyNet      1.54
Printed     1.45
Change      1.45
Variant     1.40
Either      1.39
Carbuncle   1.35
Happ        1.34
Neg         1.34
Remastered  1.32
Activations Density 0.001%