INDEX
Explanations
contractions and possessive forms
expressions related to emotional states or actions of individuals
Negative Logits
oner            -0.64
Composite       -0.64
Examination     -0.64
Classification  -0.63
omsky           -0.62
Denis           -0.61
vision          -0.60
Nanto           -0.60
detail          -0.59
lear            -0.59
Positive Logits
inevitably      0.72
mattered        0.72
idle            0.71
inconvenient    0.68
peril           0.68
shine           0.67
darkest         0.67
hottest         0.66
malf            0.66
wcsstore        0.66
Activations Density: 0.271%