INDEX
Explanations
proper nouns
instances of the name "Ari."
Negative Logits
Token         Logit
ding          -0.72
eners         -0.66
enegger       -0.66
irk           -0.65
limited       -0.64
FORMATION     -0.63
WARD          -0.62
mit           -0.62
Fargo         -0.61
Confederacy   -0.60
Positive Logits
Token         Logit
zon           1.02
jit           0.93
zona          0.92
zeb           0.91
leans         0.88
Ģ             0.87
ī             0.87
usha          0.86
asing         0.84
ija           0.83
Activations Density: 0.042%