INDEX
Explanations
references to the specific term "Hol", either on its own or followed by another word
Negative Logits
Token          Logit
Lowell         -0.70
ITAL           -0.68
EED            -0.66
REE            -0.66
Pool           -0.64
Strauss        -0.61
responsible    -0.60
Elon           -0.59
inhibitor      -0.58
inhibitors     -0.58
Positive Logits
Token          Logit
ocaust         1.46
idays          1.40
iday           1.36
ocene          1.34
oho            1.06
isky           1.03
stery          1.03
comb           1.00
ophon          0.99
mberg          0.98
Activations Density 0.035%