INDEX
Explanations
expressions related to belonging or inclusion
expressions of belonging or inclusion within a group or community
Negative Logits
torped      -0.72
gently      -0.68
calves      -0.66
berus       -0.65
stasy       -0.65
ceilings    -0.65
Recommend   -0.64
ricanes     -0.64
bolts       -0.63
bapt        -0.63
Positive Logits
ridge       0.82
ICLE        0.81
icle        0.78
aking       0.78
ner         0.76
icular      0.75
ners        0.74
ials        0.72
ioned       0.71
iator       0.69
Activations Density 0.031%