INDEX
Explanations
names or references related to the person or character "Halle" with varying activation strengths
occurrences of the word "alle" and similar-sounding variations
Negative Logits
icum      -0.95
ascus     -0.88
essions   -0.81
iaries    -0.79
etheless  -0.79
ocre      -0.77
icient    -0.76
ilon      -0.74
asive     -0.72
aido      -0.72
Positive Logits
tto    1.12
WER    0.94
tta    0.94
xual   0.88
phant  0.76
tti    0.73
mann   0.73
bucks  0.71
Sorce  0.70
gre    0.70
Activations Density: 0.059%