INDEX
Explanations
phrases indicating personal reflections or experiences
Negative Logits
Coverage      -0.78
pestic        -0.71
Workers       -0.71
Poster        -0.70
Officials     -0.69
Vest          -0.67
Protective    -0.65
ILCS          -0.64
«ĺ            -0.63
Verify        -0.63
Positive Logits
seemed        1.16
fascinated    1.16
intrigued     1.08
bothered      1.08
reminded      1.08
dawn          1.06
bothers       1.05
sucked        1.03
annoy         1.02
myself        1.02
Activations Density 0.385%