INDEX
Explanations
words relating to names
human names or names of individuals related to a context
Negative Logits
Token      Logit
staking    -0.85
tails      -0.78
ebted      -0.76
cens       -0.73
Scare      -0.73
conduc     -0.71
ãģ®éŃĶ     -0.70
enegger    -0.69
tailed     -0.68
tail       -0.66
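One entry in the negative-logit list, ãģ®éŃĶ, looks like the printable-character form that GPT-2 style byte-level BPE vocabularies use for non-ASCII bytes. The source does not say which tokenizer produced these tokens, so the sketch below is an assumption: it rebuilds the byte-to-character table from the public GPT-2 encoder and inverts it to recover the underlying UTF-8 text. The function names are illustrative and not taken from the page.

```python
def bytes_to_unicode():
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE.

    Assumption: the tokens on this page come from such a vocabulary.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points 256, 257, ...
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


def decode_byte_level_token(token):
    """Map each displayed character back to its byte, then decode as UTF-8."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


# The token from the list, written with escapes to avoid normalization issues.
raw_token = "\u00e3\u0123\u00ae\u00e9\u0143\u0136"
print(decode_byte_level_token(raw_token))
```

If the byte-level assumption holds, the entry decodes to the Japanese fragment の魔. Plain ASCII tokens such as tail pass through unchanged.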
Positive Logits
Token      Logit
mn         0.76
ng         0.76
rs         0.76
ctr        0.75
xon        0.74
cs         0.72
mt         0.71
mr         0.71
alm        0.71
TG         0.69
Activations Density 0.192%
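As a rough illustration of the density figure, the sketch below treats activation density as the fraction of sampled tokens on which the feature fires. The page does not give its exact definition, so that reading, the zero threshold, and the synthetic activations are all assumptions.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature is active.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, with the feature firing on 192 of them.
acts = [0.0] * 100_000
for i in range(192):
    acts[i * 500] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

With these made-up numbers the result is 192 / 100,000, the same proportion as the 0.192% reported here. A feature that sparse is active on roughly one token in every 520.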