INDEX
Explanations
references to Berkshire and Berkeley
Negative Logits
rawer        -0.19
olson        -0.15
tery         -0.14
gers         -0.14
inson        -0.14
اÙĪÙĨد        -0.14
unner        -0.14
沿           -0.14
Secondary    -0.14
ذ            -0.14
Positive Logits
shire        0.23
ley          0.20
LEY          0.19
lee          0.16
adir         0.16
Hath         0.16
zee          0.16
elper        0.15
ades         0.15
RITE         0.15
Activations Density 0.009%