INDEX
Explanations
specific times and locations mentioned in the text
Negative Logits
advoc          -0.66
isphere        -0.60
Connector      -0.58
informants     -0.58
prescriptions  -0.58
straight       -0.58
offending      -0.56
casting        -0.56
selves         -0.55
biased         -0.55
Positive Logits
mosp           1.07
las            1.01
hens           0.95
onement        0.93
least          0.93
letico         0.90
mega           0.87
abase          0.84
yp             0.83
raz            0.83
Activations Density: 0.035%