INDEX
Explanations
mentions of the New England Patriots football team
Negative Logits
onas     -0.15
oref     -0.15
mates    -0.14
ALI      -0.14
ypse     -0.14
Král     -0.14
ucene    -0.14
ntax     -0.14
ipeg     -0.14
ibold    -0.13
Positive Logits
911      0.17
ussia    0.15
rays     0.15
zsche    0.14
141      0.14
780      0.14
FO       0.14
rollers  0.14
915      0.14
MID      0.14
Activations Density: 0.002%