INDEX
Explanations
occurrences of the word "Office"
Negative Logits
oise        -0.71
udeau       -0.68
nir         -0.67
Kaine       -0.67
ipe         -0.65
artifacts   -0.65
theless     -0.65
inging      -0.64
isers       -0.64
hered       -0.64
Positive Logits
Depot       0.88
365         0.87
gur         0.87
Office      0.76
Personnel   0.76
Desk        0.73
Office      0.69
person      0.69
Sphere      0.67
XIII        0.66
Activations Density 0.017%