INDEX
Explanations
references to "pieces" and their ownership
Negative Logits
795      -0.18
sigh     -0.15
asp      -0.14
datap    -0.14
882      -0.14
ittal    -0.13
olas     -0.13
voir     -0.13
aux      -0.13
622      -0.13
Positive Logits
machinery    0.23
legislation  0.22
advice       0.21
furniture    0.21
scenery      0.21
luggage      0.20
footage      0.20
wreckage     0.20
territory    0.19
evidence     0.19
Activations Density: 0.141%