INDEX
Explanations
names and terms related to individuals or organizations
Negative Logits
faint       -0.73
hyde        -0.69
congr       -0.67
lapse       -0.66
inaug       -0.65
accent      -0.64
Recomm      -0.63
subt        -0.61
Samar       -0.60
REDACTED    -0.59
Positive Logits
hett        0.90
idon        0.90
cheon       0.85
tle         0.82
itzer       0.81
oslav       0.76
aughters    0.75
hemy        0.74
ifix        0.73
cross       0.73
Activations Density: 1.313%