INDEX
Explanations
names of people or locations
names and terms related to individuals, particularly in a legal or news context
Negative Logits
interstitial    -0.84
intermedi       -0.83
INF             -0.81
Malone          -0.73
364             -0.72
Instr           -0.71
agascar         -0.69
135             -0.69
Alexis          -0.68
244             -0.67
Positive Logits
w               1.24
W               1.15
WM              1.12
wi              1.08
wit             1.05
WC              1.03
wl              1.02
wa              1.02
wt              1.02
wal             1.02
Activations Density: 0.208%