Explanations
occurrences of specific names or identifiers
Negative Logits
Token       Logit
ascript     -0.90
ividual     -0.85
isite       -0.80
raltar      -0.76
ilts        -0.76
erest       -0.75
irlf        -0.75
gerald      -0.73
lections    -0.72
berus       -0.70

Positive Logits
Token       Logit
Pai         0.93
Dur         0.76
Kumar       0.74
à¦          0.72
ullah       0.72
Rai         0.70
ji          0.70
Chou        0.70
Bang        0.69
Gandhi      0.68
Activations Density: 0.018%