INDEX
Explanations
references to friendly interactions between individuals or groups
instances of the word "friendly" and its variations in different contexts
Negative Logits
Token     Logit
ĸļ        -0.83
rast      -0.83
IGHTS     -0.78
illion    -0.76
stall     -0.75
heed      -0.75
illon     -0.72
krit      -0.72
omore     -0.72
uggage    -0.72
Positive Logits
Token     Logit
confines  0.97
liest     0.76
bye       0.76
liness    0.74
friendly  0.73
Friendly  0.72
spirits   0.72
ship      0.71
gesture   0.70
lier      0.70
Activations Density 0.032%
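
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming it is the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from this page or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 3 of which are active.
toy_activations = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.032% would mean the feature is active on roughly 3 of every 10,000 sampled tokens, which is consistent with a feature tied to one narrow word family.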