Explanations
expressions of greeting or goodwill
Negative Logits
meer    -0.19
geber   -0.17
gens    -0.16
eness   -0.16
aven    -0.15
ÑĤÑĮ    -0.15
weed    -0.15
went    -0.15
chen    -0.15
lover   -0.14
Positive Logits
ospace   0.18
faith    0.15
vard     0.15
fore     0.15
fun      0.15
Buckley  0.15
aways    0.15
ghost    0.15
shown    0.14
apesh    0.14
Activations Density 0.036%