INDEX
Explanations
words related to society, community, and personal life
references to social issues and their impact on individuals' lives
Negative Logits
urat          -0.75
POR           -0.73
escription    -0.69
Sov           -0.65
iago          -0.62
Sisters       -0.61
anche         -0.61
rib           -0.60
itars         -0.58
urances       -0.57
Positive Logits
advertising      0.74
manship          0.73
âĢķ              0.72
circles          0.72
Indies           0.71
hierarchy        0.69
attRot           0.66
contexts         0.66
exponentially    0.65
vicinity         0.63
Activations Density 0.440%