INDEX
Explanations
mentions of specific individuals
the word "then" used in a temporal context
Negative Logits
toe            -0.71
illon          -0.68
sty            -0.60
triangles      -0.60
fulfillment    -0.59
upuncture      -0.59
azor           -0.59
acts           -0.58
reservation    -0.58
mson           -0.58
Positive Logits
-'                  0.80
Yugoslav            0.78
Yugoslavia          0.73
oslov               0.68
unsuccessfully      0.68
unpublished         0.68
————————————————    0.67
adays               0.67
ceed                0.66
nih                 0.65
Activations Density: 0.042%