INDEX
Explanations
phrases that denote functions or roles of entities
Negative Logits
aping        -0.77
代           -0.75
hess         -0.74
gling        -0.69
differed     -0.69
ヴ           -0.68
better       -0.66
apes         -0.65
gins         -0.65
respective   -0.64
Positive Logits
scrimmage    0.77
bail         0.76
proxy        0.72
bedrock      0.72
propulsion   0.71
biotech      0.69
Primal       0.68
volunt       0.66
aggro        0.65
foundations  0.65
Activations Density: 0.053%