INDEX
Explanations
adjective-noun phrases
phrases that refer to specific groups of people
Negative Logits
Token         Logit
Merit         -0.63
Shake         -0.59
Rescue        -0.59
iven          -0.59
Poké          -0.59
yours         -0.58
Adventures    -0.58
Lou           -0.57
courtesy      -0.56
Buster        -0.56
Positive Logits
Token         Logit
iris          0.85
oppose        0.83
contemplate   0.82
perpetuate    0.81
rir           0.80
kie           0.79
creen         0.78
benefited     0.78
partake       0.76
paces         0.75
Activations Density: 0.121%