Explanations
references to a specific individual, particularly in terms of their influence or achievements
Negative Logits
Token     Logit
puter     -0.17
agy       -0.16
Ìģ        -0.15
patial    -0.14
anter     -0.14
atif      -0.14
going     -0.14
iect      -0.14
tml       -0.13
genic     -0.13
Positive Logits
Token     Logit
ufac      0.21
hattan    0.20
agements  0.17
opause    0.16
ataka     0.16
evi       0.15
agment    0.15
iac       0.15
power     0.15
ifest     0.15
Activations Density 0.107%
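
The density figure above is commonly read as the share of token positions on which the feature fires at all. The following is a minimal sketch of that calculation, assuming density means "percentage of positions with activation above zero"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature is active.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.107% would correspond to the feature being active on roughly one token position in every 935.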