INDEX
Explanations
the word "Horace" with varying levels of activation
references to the name "Horace."
Negative Logits
ħĭ                  -0.76
ãģ¦                 -0.76
anwhile             -0.72
ãģ®éŃĶ              -0.71
0000000000000000    -0.67
succeeding          -0.66
Leilan              -0.65
enhagen             -0.64
graded              -0.62
conservancy         -0.61
Positive Logits
izontal             1.43
izont               1.38
izons               1.27
rible               1.27
ror                 1.15
ribly               1.14
izon                1.13
rors                1.09
cru                 1.08
oscope              1.06
Activations Density: 0.026%