INDEX
Explanations
historical events and figures related to significant social or political changes
Negative Logits
ocz       -0.15
asley     -0.15
otton     -0.15
uard      -0.15
zman      -0.15
opsis     -0.15
ognito    -0.15
ebi       -0.14
onica     -0.14
rido      -0.14
Positive Logits
945        0.17
875        0.15
885        0.15
منظور      0.15
939        0.15
764        0.15
948        0.15
_readable  0.14
aptop      0.14
reh        0.14
Activations Density 0.559%