INDEX
Explanations
names of political figures
individual letters or single-character variables
Negative Logits
Measure      -0.72
convol       -0.67
subtract     -0.64
dispatcher   -0.64
feder        -0.63
cryst        -0.63
CHO          -0.62
editor       -0.60
depress      -0.59
ept          -0.59
Positive Logits
uala         0.94
shi          0.93
icz          0.85
ian          0.84
ius          0.80
ía           0.79
ny           0.77
itic         0.77
itsch        0.77
nikov        0.75
Activations Density 0.186%