INDEX
Explanations
proper nouns
occurrences of the substring "ber."
Negative Logits
Token       Logit
URA         -0.64
Orwell      -0.63
pestic      -0.62
awa         -0.61
ENDED       -0.61
Sloven      -0.60
Objective   -0.60
pity        -0.58
ourced      -0.58
Gujar       -0.57

Positive Logits
Token       Logit
culosis     1.48
deen        1.30
keley       1.24
wald        1.17
netic       1.17
ding        1.04
punk        1.01
lin         0.91
geist       0.90
geon        0.90
Activations Density 0.034%