INDEX
Explanations
words related to people's names or references to organizations
occurrences of the segment "pe"
NEGATIVE LOGITS
Token        Logit
é¾           -0.93
GOODMAN      -0.74
microscope   -0.68
swirling     -0.63
Kirin        -0.63
simultane    -0.62
enegger      -0.62
nets         -0.62
calculus     -0.60
ãĤ¨ãĥ«       -0.60
POSITIVE LOGITS
Token        Logit
cific        1.10
anut         1.03
oples        1.00
uberty       0.99
asant        0.97
ller         0.97
rer          0.96
cies         0.92
rers         0.90
cial         0.88
Activations Density: 0.012%