INDEX
Explanations
references to artistic works or tangible items
Negative Logits
token      logit
795        -0.15
sigh       -0.15
hugs       -0.14
ulty       -0.14
rrha       -0.14
asca       -0.14
spreads    -0.14
974        -0.14
184        -0.13
063        -0.13
Positive Logits
token        logit
legislation  0.23
machinery    0.23
advice       0.21
furniture    0.20
luggage      0.20
evidence     0.19
equipment    0.19
artwork      0.19
Legislation  0.18
Americ       0.17
Activations Density: 0.072%