INDEX
Explanations
references to individuals and their actions or experiences
Negative Logits
itſelf      -1.02
Efq         -0.97
Diſ         -0.94
Majefty     -0.91
Reſ         -0.89
Chriftian   -0.87
Theſe       -0.86
Monfieur    -0.85
houſe       -0.84
Jefus       -0.84
Positive Logits
را          0.82
音を        0.76
को          0.75
ನ್ನು         0.73
MENAFN      0.71
名を        0.71
devamını    0.71
ceğini      0.68
みを        0.67
த்தை        0.67
Activations Density: 0.058%