INDEX
Explanations
instances of the word "or"
Head Attr Weights
0:0.08
1:0.09
2:0.09
3:0.09
4:0.07
5:0.07
6:0.08
7:0.07
8:0.08
9:0.08
10:0.08
11:0.09
Negative Logits
Corpus    -3.00
�         -2.71
)|        -2.64
GOODMAN   -2.58
AMY       -2.53
finance   -2.49
=>        -2.48
ع         -2.45
=>        -2.43
.","      -2.41
Positive Logits
warm      2.72
buck      2.69
Witches   2.63
Wizards   2.57
Slot      2.57
Shank     2.56
basket    2.55
baskets   2.53
Gat       2.51
Gan       2.50
Activations Density 0.000%