INDEX
Explanations
names of people
references to authors and creators of works
Negative Logits
iasm        -0.84
zers        -0.71
thood       -0.70
boxing      -0.70
weak        -0.66
addons      -0.64
udging      -0.63
activate    -0.63
vin         -0.63
wards       -0.62
Positive Logits
Jr          0.94
Architects  0.92
QC          0.91
PhD         0.88
CFR         0.83
aka         0.82
who         0.78
whose       0.77
ISBN        0.72
LLC         0.72
Activations Density 0.239%