INDEX
Explanations
instances of the letter 't'
Head Attr Weights
0:0.04
1:0.03
2:0.09
3:0.28
4:0.08
5:0.04
6:0.04
7:0.05
8:0.06
9:0.07
10:0.09
11:0.08
Negative Logits
whoever       -1.72
Meow          -1.41
Lilly         -1.39
wherever      -1.38
motto         -1.33
whenever      -1.33
Amendments    -1.33
Elaine        -1.33
udos          -1.29
Nicola        -1.29
Positive Logits
lag           1.62
arching       1.57
alm           1.54
alties        1.53
mble          1.51
yrus          1.50
atro          1.44
icka          1.43
ć             1.42
asin          1.42
Activations Density 0.000%