INDEX
Explanations
instances of meetings and interactions
Negative Logits
secut        -0.71
container    -0.62
duc          -0.61
membranes    -0.60
untreated    -0.60
ocker        -0.60
secution     -0.60
wards        -0.60
commentary   -0.60
vernment     -0.59
Positive Logits
amorph       1.28
ropolis      0.98
agame        0.93
tle          0.90
ered         0.78
ioned        0.75
atron        0.74
onym         0.74
ups          0.74
allic        0.73
Activations Density 0.611%