INDEX
Explanations
occurrences of the word "this" in the text
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.08
4:0.09
5:0.09
6:0.07
7:0.08
8:0.07
9:0.08
10:0.09
11:0.07
Negative Logits
Polly     -3.11
Peck      -2.99
Corker    -2.96
Pike      -2.82
Lydia     -2.76
Waters    -2.74
Patt      -2.71
Joyce     -2.67
OPA       -2.62
Petersen  -2.62
Positive Logits
doms             2.86
fu               2.73
hides            2.65
wastes           2.54
welf             2.50
FactoryReloaded  2.46
cht              2.46
Omn              2.45
deceive          2.43
accur            2.42
Activations Density 0.000%