Explanations
references to society or social terms
references to social structures or organizations
Negative Logits
Token        Logit
fluorescent  -0.72
knocking     -0.67
stump        -0.65
ultrasound   -0.62
vomit        -0.62
recall       -0.62
blast        -0.60
lamb         -0.60
++++         -0.60
invasive     -0.60
Positive Logits
Token   Logit
ieties  1.53
iety    1.52
ietal   1.48
keye    1.29
cer     1.24
iet     1.19
ionics  1.05
ivil    0.97
io      0.92
ethe    0.90
Activations Density: 0.027%