Explanations
references to high-ranking or important figures/entities within a specific context or domain
references to arch rivals or archbishops
Negative Logits
Token       Logit
Ô           -0.84
uana        -0.82
arettes     -0.78
leneck      -0.74
Citation    -0.70
ña          -0.69
ACTION      -0.67
ABE         -0.66
OTOS        -0.65
Serve       -0.64
Positive Logits
Token       Logit
bishop      1.06
ipel        1.04
arch        0.97
di          0.87
itect       0.85
ivist       0.81
ival        0.81
rival       0.81
hum         0.79
arch        0.76
Activations Density 0.008%