Explanations
proper nouns
proper nouns and specific names in the text
Negative Logits
enegger     -0.81
anwhile     -0.80
referen     -0.69
destro      -0.68
代          -0.67
chnology    -0.63
ãĥ¼ãĥĨ      -0.60
raints      -0.60
ĸļ          -0.60
mble        -0.60
Positive Logits
Coin        0.61
Forge       0.54
Day         0.53
Py          0.53
Hut         0.53
star        0.53
eland       0.53
Temple      0.53
Street      0.52
han         0.52
Activations Density: 0.648%