INDEX
Explanations
themes related to empathy and personal reflection on societal issues
Negative Logits
Token        Logit
regardless   -0.20
alo          -0.16
Mara         -0.16
izu          -0.15
whereby      -0.15
Tep          -0.15
smoothed     -0.15
wording      -0.15
beside       -0.14
Regardless   -0.14
Positive Logits
Token        Logit
till         0.25
Till         0.20
atleast      0.19
Til          0.17
Apart        0.17
hog          0.17
Credits      0.16
erst         0.16
upto         0.16
etiqu        0.16
Activations Density: 3.387%