Explanations
phrases related to sharing information or content with others
instances of sharing information or experiences with others
Negative Logits
Token       Logit
vacc        -0.72
venant      -0.68
itures      -0.66
orah        -0.65
ucci        -0.64
ischer      -0.62
hemat       -0.61
excluding   -0.61
vein        -0.60
zanne       -0.60
Positive Logits
Token       Logit
regards     1.14
regard      1.06
stood       1.04
draw        0.88
respect     0.87
coworkers   0.86
fellow      0.84
dignity     0.83
impunity    0.83
us          0.80
Activations Density: 0.082%