Explanations
instances of the word "of"
Head Attr Weights
0:0.07
1:0.08
2:0.08
3:0.08
4:0.07
5:0.09
6:0.07
7:0.08
8:0.09
9:0.08
10:0.07
11:0.08
Negative Logits
yout          -2.87
ranch         -2.87
pling         -2.63
interstitial  -2.60
ciples        -2.60
rollers       -2.56
iversal       -2.49
gregation     -2.48
Frames        -2.48
ibl           -2.46
Positive Logits
Aires         3.36
Zurich        3.14
Erit          3.07
NEC           3.02
Miliband      2.96
Eliot         2.94
Aust          2.81
Switzerland   2.72
Anat          2.70
Austrian      2.70
Activations Density 0.000%