Explanations
words related to physical feelings or sensations, especially negative ones
occurrences of the substring "fe" within words
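As a rough illustration of the second explanation, here is a minimal sketch that picks out the words in a text that contain "fe". The helper name `words_with_fe` and the case-insensitive matching are assumptions made for this example; they do not come from the page and say nothing about how the feature itself is computed.

```python
import re

def words_with_fe(text: str) -> list[str]:
    """Return the words in `text` that contain the substring "fe" (hypothetical helper)."""
    # \w* on both sides lets "fe" sit at the start, middle, or end of a word.
    # IGNORECASE is an assumption, so "Fever" counts as well as "fever".
    return re.findall(r"\b\w*fe\w*\b", text, flags=re.IGNORECASE)

print(words_with_fe("She feels safe after the coffee."))
# → ['feels', 'safe', 'coffee']
```

Note that "after" is not matched: it contains "f" and "e", but not adjacent to each other.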
Negative Logits
Token        Logit
worthy       -0.73
GEAR         -0.69
antically    -0.67
ANGEL        -0.63
stood        -0.62
stretched    -0.62
ova          -0.61
70710        -0.61
Colleges     -0.60
owl          -0.59
Positive Logits
Token        Logit
eling        1.09
els          0.98
ck           0.95
ffer         0.94
lda          0.93
cking        0.93
cker         0.91
elin         0.90
efe          0.89
ruary        0.88
Activations Density: 0.006%