INDEX
Explanations
concepts related to emotional or psychological distress
Negative Logits
']):            -0.82
AxisAlignment   -0.82
}")             -0.80
"),             -0.70
'],             -0.69
'),             -0.68
']],            -0.68
                -0.66
')],            -0.66
"])             -0.66
Positive Logits
moveToFirst     0.69
warts           0.62
illiteracy      0.61
PerformLayout   0.60
Pilate          0.57
anyway          0.56
виправивши      0.53
totalSupply     0.52
pyramids        0.52
tubercle        0.51
Activations Density 0.742%