INDEX
Explanations
proper nouns related to people's names
mentions of the name "Ar"
Negative Logits
ĸļ          -0.87
¬¼          -0.79
stakes      -0.77
eners       -0.74
ãģį         -0.74
hower       -0.69
ĨĴ          -0.68
iculty      -0.65
sylvania    -0.63
sterling    -0.62
Positive Logits
beit        1.15
ansas       1.09
issa        1.08
ithmetic    1.06
rival       1.01
thritis     0.98
leigh       0.98
lington     0.97
rington     0.95
ranging     0.95
Activations Density 0.023%
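
Some entries in the negative logits list (ĸļ, ¬¼, ãģį, ĨĴ) are not encoding errors. They look like the printable stand-ins that a byte-level BPE vocabulary uses for raw bytes. The page does not say which tokenizer produced them, so the sketch below assumes a GPT-2-style byte-to-character table; the helper names `bytes_to_unicode` and `token_to_bytes` are chosen here for illustration and are not taken from this page.

```python
def bytes_to_unicode():
    """Build a GPT-2-style table mapping each byte (0-255) to a printable character.

    Assumption: the tokens on this page use this convention.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    codepoints = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            printable.append(b)
            codepoints.append(256 + shift)
            shift += 1
    return {b: chr(cp) for b, cp in zip(printable, codepoints)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Recover the raw bytes behind a displayed byte-level BPE token."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


if __name__ == "__main__":
    for tok in ["ĸļ", "¬¼", "ãģį", "ĨĴ"]:
        raw = token_to_bytes(tok)
        # Fragments of multi-byte characters do not decode on their own.
        print(tok, raw.hex(), raw.decode("utf-8", errors="replace"))
```

Under this assumption, ãģį corresponds to the bytes `e3 81 8d`, which is the UTF-8 encoding of the hiragana き. The other three tokens map to continuation bytes only, so they are pieces of longer multi-byte characters rather than complete characters.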