INDEX
Explanations
References to "neo" and neo-prefixed terms, particularly in political or social contexts
Negative Logits
Token           Logit
Ĭ               -0.15
Rosenberg       -0.15
idable          -0.15
gree            -0.14
овар            -0.13
Transitional    -0.13
rella           -0.13
гляд            -0.13
ADATA           -0.13
inan            -0.13
Positive Logits
Token           Logit
cons            0.26
-Nazi           0.22
-cons           0.22
cons            0.20
CONS            0.19
-lib            0.19
ček             0.19
classical       0.18
-class          0.17
řich            0.17
Activations Density 0.007%