INDEX
Explanations
commonalities or shared characteristics between people or groups
expressions related to shared beliefs or experiences
Negative Logits
anish    -0.73
aunder   -0.72
cember   -0.70
ī        -0.68
llers    -0.67
²¾       -0.66
icans    -0.65
desper   -0.65
uren     -0.62
iffe     -0.62
Positive Logits
similarities      1.00
responsibility    0.84
cro               0.79
responsibilities  0.76
characteristics   0.74
resemb            0.73
interests         0.73
ership            0.69
resemblance       0.69
sentiments        0.68
Activations Density: 0.039%