INDEX
Explanations
personal reflections and expressions of emotional distress
references to specific subjects denoted by "this" or "these"
Negative Logits
Token      Logit
ãĤ¹ãĥĪ     -0.99
Īè         -0.86
Ĭ±         -0.83
ãĤ¶        -0.78
ãĥĭ        -0.75
ARS        -0.73
Ľ          -0.72
·          -0.72
ivas       -0.71
©¶æ¥µ      -0.70
Positive Logits
Token      Logit
guy        0.95
kind       0.84
morning    0.80
sort       0.79
week       0.77
sucker     0.76
stuff      0.76
country    0.74
secrecy    0.71
wonderful  0.71
Activations Density 0.226%
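
Several of the negative-logit tokens above (for example ãĤ¹ãĥĪ) look like mojibake but are consistent with the printable stand-ins a GPT-2-style byte-level BPE vocabulary uses for raw bytes. The source does not say which tokenizer produced them, so the following is only a minimal sketch under that assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a byte-level token string back to text; partial UTF-8 becomes U+FFFD."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ¹ãĥĪ"))  # スト
```

Tokens that hold only a fragment of a multi-byte character (such as Īè) decode to the replacement character U+FFFD, since a fragment is not valid UTF-8 on its own. Plain ASCII tokens such as guy pass through unchanged.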