INDEX
Explanations
instances of phrases structured as "people in the (specific location or context)"
repeated mentions of the word "the."
Negative Logits
Token       Logit
LOCK        -0.80
BIL         -0.76
Iterator    -0.73
whilst      -0.71
EVA         -0.68
VPN         -0.67
instead     -0.67
lessly      -0.67
anew        -0.67
again       -0.66
Positive Logits
Token       Logit
vicinity    1.04
same        1.00
country     0.92
latter      0.88
periphery   0.87
slightest   0.86
nation      0.85
world       0.82
region      0.80
highest     0.80
Activations Density 0.402%
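
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is obtained: the fraction of token positions on which the feature's activation is above zero. This is an assumption about the definition, not something the page states; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source page): a position counts
    as active when its activation is strictly greater than the threshold.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.5, 1.2, 0.3, 2.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.402% would mean the feature fires on roughly 4 of every 1000 tokens in the sampled text.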