INDEX
Explanations
mentions of or references to civilization and related concepts such as civility and annexation
terms related to civility and civil affairs
Negative Logits
prints        -0.87
horn          -0.75
velength      -0.68
iffe          -0.67
Parallel      -0.66
association   -0.66
hops          -0.66
takeaway      -0.65
              -0.65
ploma         -0.65
Positive Logits
ility         1.25
ilities       1.04
itas          0.95
vy            0.94
ili           0.86
iencies       0.85
ould          0.83
auld          0.82
itational     0.82
ically        0.81
Activations Density 0.018%