INDEX
Explanations
words related to governance, politics, and societal issues
Negative Logits
chwitz      -0.69
andise      -0.68
Indra       -0.67
ADRA        -0.66
reon        -0.65
CLASSIFIED  -0.63
HEAD        -0.61
ONSORED     -0.59
indo        -0.58
hotly       -0.58
Positive Logits
(<          1.12
pox         0.99
minded      0.94
azaki       0.83
minded      0.80
folk        0.80
case        0.80
est         0.80
tiny        0.76
entary      0.76
Activations Density: 2.234%