INDEX
Explanations
email addresses
technical terms and jargon
NEGATIVE LOGITS
Instr           -0.68
earnest         -0.64
appe            -0.63
Barrett         -0.61
Buddy           -0.60
Pric            -0.60
substitute      -0.59
accountability  -0.59
Narr            -0.58
Amnesty         -0.58
POSITIVE LOGITS
isoft           1.12
chwitz          1.02
ircraft         0.98
mble            0.95
ertain          0.95
redients        0.95
cean            0.92
mosp            0.92
duino           0.91
phal            0.91
Activations Density: 0.097%