INDEX
Explanations
references to a specific individual named Terj
Negative Logits
nia      -0.19
sert     -0.18
tee      -0.17
sb       -0.17
shares   -0.16
XO       -0.16
sm       -0.16
sd       -0.15
sports   -0.15
nett     -0.15
Positive Logits
Ter        0.24
rible      0.22
rence      0.21
restrial   0.21
rier       0.20
Ter        0.20
テル       0.19
ter        0.19
reur       0.18
akhir      0.17
Activations Density 0.004%