INDEX
Explanations
instances of the word "bank" and its variations
Negative Logits
Token     Logit
SEP       -0.17
лÑĥ       -0.16
ritz      -0.15
aneous    -0.15
antar     -0.15
atsby     -0.15
outu      -0.14
ÐĶÐļ      -0.14
/fw       -0.14
382       -0.14
Positive Logits
Token     Logit
isco      0.16
ertest    0.16
iest      0.16
marked    0.16
.fm       0.15
cies      0.15
unter     0.14
ëĿ½       0.14
918       0.14
atest     0.14
Activations Density: 0.017%