INDEX
Explanations
references to the concept of "home."
Negative Logits
naire       -0.20
naires      -0.17
ment        -0.16
ÙĨ          -0.16
mentation   -0.15
rome        -0.15
nerg        -0.15
ries        -0.15
making      -0.15
chers       -0.15
Positive Logits
coming      0.20
grown       0.19
à¯įà®      0.17
quist       0.15
sapi        0.15
/home       0.15
stead       0.15
court       0.15
ılıp        0.15
icação      0.14
Activations Density: 0.073%