Explanations
phrases related to medical and psychological conditions
Negative Logits
eous        -0.73
smanship    -0.64
urat        -0.61
liness      -0.60
enthal      -0.59
yy          -0.58
pez         -0.58
eem         -0.57
worthy      -0.56
shall       -0.56
Positive Logits
avior         0.78
behavioral    0.72
conditioning  0.68
ecology       0.64
biologists    0.61
genetics      0.60
traits        0.59
CHAT          0.58
regression    0.58
behavioural   0.58
Activations Density: 4.217%