Explanations
words and phrases related to evidence and validation of claims
Head Attr Weights
0:0.06
1:0.10
2:0.03
3:0.03
4:0.03
5:0.40
6:0.03
7:0.02
8:0.09
9:0.05
10:0.07
11:0.04
Negative Logits
achine: -1.70
Hen: -1.65
Patri: -1.62
rus: -1.59
xi: -1.57
ula: -1.56
eli: -1.55
Anth: -1.55
Dogs: -1.53
Dog: -1.49
Positive Logits
unaccount: 2.21
existent: 1.80
unlaw: 1.80
removable: 1.75
rul: 1.74
requirement: 1.73
notor: 1.71
Flavoring: 1.67
residual: 1.67
functional: 1.66
Activations Density: 0.538%