INDEX
Explanations
references to household items and activities associated with comfort or leisure
Negative Logits
ppo            -0.15
/Foundation    -0.14
ocab           -0.14
onso           -0.14
fragmentation  -0.13
byn            -0.13
Yine           -0.13
Refugee        -0.13
TypeDef        -0.12
tiener         -0.12
Positive Logits
CRET           0.17
ugen           0.16
æĽľæĹ¥         0.16
Hung           0.16
.WaitFor       0.15
emet           0.14
rose           0.14
met            0.14
óst            0.14
egers          0.14
Activations Density 0.151%