INDEX
Explanations
mentions of proper nouns or names
repeated mentions of the name "Mar."
Negative Logits
ħĭ            -0.98
éĹĺ           -0.87
culosis       -0.85
é¾įå¥ij士    -0.76
ngth          -0.72
anwhile       -0.71
CRIP          -0.71
mble          -0.67
Chosen        -0.65
=]            -0.64
Positive Logits
Mar           0.98
riage         0.96
ital          0.96
ried          0.95
riages        0.93
lene          0.85
ousing        0.85
isco          0.83
azz           0.83
Mar           0.83
Activations Density: 0.012%