INDEX
Explanations
expressions relating to societal issues and problems
Negative Logits
culosis     -0.81
oire        -0.78
oyer        -0.78
osate       -0.77
etheless    -0.76
enium       -0.75
icism       -0.73
leneck      -0.73
raper       -0.72
icka        -0.70
Positive Logits
constants    0.89
types        0.87
truths       0.86
examples     0.86
mutually     0.83
moments      0.79
topics       0.78
reminders    0.77
acron        0.75
metaphors    0.75
Activations Density: 0.051%