INDEX
Explanations
mentions of civil or civilization-related concepts
Negative Logits
egan      -0.18
book      -0.16
egrator   -0.15
_NATIVE   -0.15
books     -0.14
eva       -0.14
ÎķΤ       -0.14
esis      -0.14
amine     -0.14
å¼ı       -0.14
Positive Logits
izational  0.24
mente      0.20
izations   0.19
isations   0.17
antro      0.17
781        0.17
anter      0.17
izing      0.16
اتÛĮ       0.16
izable     0.16
Activations Density 0.014%