Explanations
historical or cultural references related to Indian religious figures and sites
Negative Logits
canf     -0.17
Lanc     -0.16
Aid      -0.16
syn      -0.16
Term     -0.15
impro    -0.14
yx       -0.14
Cain     -0.14
entes    -0.14
Lou      -0.14
Positive Logits
lesh      0.28
esh       0.26
appa      0.23
oday      0.23
arsing    0.22
argar     0.22
udev      0.21
igar      0.20
resh      0.20
war       0.20
Activations Density 0.290%